Entries in Paper (127)

Wednesday
Sep282011

Pursue robust indefinite scalability with the Movable Feast Machine

And now for something completely different, brought to you by David Ackley and Daniel Cannon in their playfully thought provoking paper: Pursue robust indefinite scalability, wherein they try to take a fresh look at neural networks, starting from scratch.

What is this strange thing called indefinite scalability? They sound like words that don't really go together:

Indefinite scalability is the property that the design can support open-ended computational growth without substantial re-engineering, in as strict as sense as can be managed. By comparison, many computer, algorithm, and network designs -- even those that address scalability -- are only finitely scalable because their scalability occurs within some finite space. For example, an in-core sorting algorithm for a 32 bit machine can only scale to billions of numbers before address space is exhausted and then that algorithm must be re-engineered.

Our idea is to expose indefinitely scalable computational power to programmers using reinvented and restricted—but still recognizable—concepts of sequential code, integer arithmetic, pointers, and user-defined classes and objects. Within the space of indefinitely scalable designs, consequently, we prioritize programmability and software engineering concerns well ahead of either theoretical parsimony or maximally efficient or flexible tile hardware design.

It's easy to read indefinite as "infinite" here. So what would such machine look like? Well, they've built the Movable Feast Machince as an implementation of their ideas:

Click to read more ...

Thursday
Sep152011

Paper: It's Time for Low Latency - Inventing the 1 Microsecond Datacenter

In It's Time for Low Latency  Stephen Rumble et al. explore the idea that it's time to rearchitect our stack to live in the modern era of low-latency datacenter instead of high-latency WANs. The implications for program architectures will be revolutionary.  Luiz André Barroso, Distinguished Engineer at Google, sees ultra low latency as a way to make computer resources, to be as much as possible, fungible, that is they are interchangeable and location independent, effectively turning a datacenter into single computer.

 Abstract from the paper:

The operating systems community has ignored network latency for too long. In the past, speed-of-light delays in wide area networks and unoptimized network hardware have made sub-100µs round-trip times impossible. However, in the next few years datacenters will be deployed with low-latency Ethernet. Without the burden of propagation delays in the datacenter campus and network delays in the Ethernet devices, it will be up to us to finish the job and see this benefit through to applications. We argue that OS researchers must lead the charge in rearchitecting systems to push the boundaries of low latency datacenter communication. 5-10µs remote procedure calls are possible in the short term – two orders of magnitude better than today. In the long term, moving the network interface on to the CPU core will make 1µs times feasible.

Click to read more ...

Thursday
Aug182011

Paper: The Akamai Network - 61,000 servers, 1,000 networks, 70 countries 

Update: as of the end of Q2 2011, Akamai had 95,811 servers deployed globally.

Akamai is the CDN to the stars. It claims to deliver between 15 and 30 percent of all Web traffic, with major customers like Facebook, Twitter, Apple, and the US military. Traditionally quite secretive, we get a peek behind the curtain in this paper: The Akamai Network: A Platform for High-Performance Internet Applications by Erik Nygren, Ramesh Sitaraman, and Jennifer Sun. 

Abstract:
Comprising more than 61,000 servers located across nearly 1,000 networks in 70 countries worldwide, the Akamai platform delivers hundreds of billions of Internet interactions daily, helping thousands of enterprises boost the performance and reliability of their Internet applications. In this paper, we give an overview of the components and capabilities of this large-scale distributed computing platform, and offer some insight into its architecture, design principles, operation, and management. 

Delivering applications over the Internet is a bit like living in the Wild West, there are problems: Peering point congestion, Inefficient communications protocols, Inefficient  routing  protocols, Unreliable networks, Scalability, Application limitations and a slow rate of change adoption. A CDN is the White Hat trying to remove these obstacles for enterprise customers. They do this by creating a delivery network that is a virtual network over the existing Internet. The paper goes on to explain how they make this happen using edge networks and a sophisticated software infrastructure. With such a powerful underlying platform, Akamai is clearly Google-like in their ability to deliver products few others can hope to match.

Detailed and clearly written, it's well worth a read.

Click to read more ...

Wednesday
Jul272011

Making Hadoop 1000x Faster for Graph Problems

Dr. Daniel Abadi, author of the DBMS Musings blog and Cofounder of Hadapt, which offers a product improving Hadoop performance by 50x on relational data, is now taking his talents to graph data in Hadoop's tremendous inefficiency on graph data management (and how to avoid it), which shares the secrets of getting Hadoop to perform 1000x better on graph data.

TL;DR:

Click to read more ...

Tuesday
May312011

Awesome List of Advanced Distributed Systems Papers

As part of Dr. Indranil Gupta's CS 525 Spring 2011 Advanced Distributed Systems class, he has collected an incredible list of resources on distributed systems. His research group is also doing some interesting work.

The various topics include: Before there Were Clouds, Cloud Computing, P2P Systems, Basic Distributed Computing Concepts, Sensor Networks, Overlays and DHTs, Cloud Programming, Cloud Scheduling, Key-Value Stores, Storage, Sensor Net Routing, Geo-Distribution, P2P Apps, In-network processing, Epidemics, Probabilistic Membership Protocols, Distributed Monitoring and  Management, Publish-Subscribe/CDNs, Measurement Studies, Old Wine: Stale or Vintage?, In Byzantium, Cloud Pricing, Other Industrial Systems, Structure of Networks, Completing the Circle, Green Clouds, Distributed Debugging, Flash!, The Middle or the End?, Availability-Aware Systems, Design Methodologies, Handling Stress, Sources of unreliability in networks, Handling Stress, Selfish algorithms, Security, Economic Theory, The future of sensor nets?, The End-to-End Approach, Automatic Computing and Inference, Caching, Classical Algorithms, Topology and Naming, Practical theory perspectives, Modular Systems.

That's just the list of topics! For every topic there's the slide deck used to teach the class, a main list of papers and a second list of optional papers. So there's a lot to choose from. Happy reading! If any of the papers really stand out for you, please share.

Click to read more ...

Thursday
May122011

Paper: Mind the Gap: Reconnecting Architecture and OS Research

Mind the Gap: Reconnecting Architecture and OS Research is a paper presented at HotOS XIII, the place where researchers talk about making potential futures happen. For a great overview of the conference take a look at this article by Matt Welsh: Conference report: HotOS 2011 in Napa.

In the VM/cloud age I question the need of having an OS at all, programs can compile directly against "raw" hardware, but the paper does a good job of trying to figure out the new roll operating systems can play in the future. We've been in a long OS holding pattern, so long that we've seen the rise of PaaS vendors skipping the OS level abstraction completely, but there's room for a middle ground between legacy time sharing systems of the past and service level APIs that are but one possible future.

Introduction:

Click to read more ...

Thursday
May052011

Paper: A Study of Practical Deduplication

With BigData comes BigStorage costs. One way to store less is simply not to store the same data twice. That's the radically simple and powerful notion behind data deduplication. If you are one of those who got a good laugh out of the idea of eliminating SQL queries as a rather obvious scalability strategy, you'll love this one, but it is a powerful feature and one I don't hear talked about outside the enterprise. A parallel idea in programming is the once-and-only-once principle of never duplicating code.

Using deduplication technology, for some upfront CPU usage, which is a plentiful resource in many systems that are IO bound anyway, it's possible to reduce storage requirements by upto 20:1, depending on your data, which saves both money and disk write overhead. 

This comes up because of really good article Robin Harris of StorageMojo wrote, All de-dup works, on a paper,  A Study of Practical Deduplication by Dutch Meyer and William Bolosky, 

For a great explanation of deduplication we turn to Jeff Bonwick and his experience on the ZFS Filesystem:

Click to read more ...

Wednesday
Apr132011

Paper: NoSQL Databases - NoSQL Introduction and Overview

Christof Strauch, from Stuttgart Media University, has written an incredible 120+ page paper titled NoSQL Databases as an introduction and overview to NoSQL databases . The paper was written between 2010-06 and 2011-02, so it may be a bit out of date, but if you are looking to take in the NoSQL world in one big gulp, this is your chance. I asked Christof to give us a  short taste of what he was trying to accomplish in his paper:

Click to read more ...

Thursday
Apr072011

Paper: A Co-Relational Model of Data for Large Shared Data Banks

Let's play a quick game of truth or sacrilage: are SQL and NoSQL are really just two sides of the same coin? That's what Erik Meijer and Gavin Bierman would have us believe in their "we can all get along and make a lot of money" article in the Communications of the ACM, A Co-Relational Model of Data for Large Shared Data Banks. You don't believe it? It's math, so it must be true :-) Some key points:

In this article we present a mathematical data model for the most common noSQL databases—namely, key/value relationships—and demonstrate that this data model is the mathematical dual of SQL's relational data model of foreign-/primary-key relationships

...we believe that our categorical data-model formalization and monadic query language will allow the same economic growth to occur for coSQL key-value stores.

...In contrast to common belief, the question of big versus small data is orthogonal to the question of SQL versus coSQL. While the coSQL model naturally supports extreme sharding, the fact that it does not require strong typing and normalization makes it attractive for "small" data as well. On the other hand, it is possible to scale SQL databases by careful partitioning.
What this all means is that coSQL and SQL are not in conflict, like good and evil. Instead they are two opposites that coexist in harmony and can transmute into each other like yin and yang. Because of the common query language based on monads, both can be implemented using the same principles.

I'm certainly in no position to judge this work, or what it means at some deep level. After reading a 1000 treatments on monads I still have no idea what they are. But, like the Standard Model in physics, it would be satisfying if some unifying principles underlay all this stuff. Would we all get along? That's a completely different question...

Thursday
Mar172011

Are long VM instance spin-up times in the cloud costing you money?

Are long VM instance spin-up times in the cloud costing you money? That's the question that immediately came to mind when James Urquhart, in an interview at the Stata Conference, made this thought provoking comment: the faster you can get the resources into the hands of the people who use them, the more money you save overall.

One of the many super powers of the cloud is elasticity, the ability to dynamically acquire and release resources in response to demand. But like any good superhero, their strength must also form the basis of a not quite fatal flaw. Years and years of angsty episodes are usually required to explore this contradiction.

In the case of the cloud, the weakness reveals itself in slow VM spin-up times. Spinning up a VM in EC2 can take a little as 1-3 minutes, or can average 5-10 minutes, or it can take much longer if there's heavy usage in your availability zone. EC2 is not alone. A common complaint about Google App Engine is the cold-start problem. When a request comes in, an application must be initialized to handle it, which takes time, which means the end-user experiences increased latency.

For the rest of the article please click through...

Click to read more ...

Page 1 ... 4 5 6 7 8 ... 13 Next 10 Entries »