Entries in Paper (127)

Wednesday
Feb 16, 2011

Paper: An Experimental Investigation of the Akamai Adaptive Video Streaming

Video is hot on the Internet and people are really interested in knowing how to make it work. Dan Rayburn has a post pointing to a fascinating paper: An Experimental Investigation of the Akamai Adaptive Video Streaming, which talks in some detail about the protocols big players like YouTube, Skype and Akamai use to serve video over an inherently video-unfriendly medium like the Internet. For Akamai they found:

  1. Each video is encoded in five versions at different bit rates and stored in separate files.
  2. The client sends commands to the server with an average inter-departure time of about 2 s, i.e. the control algorithm is executed on average every 2 seconds (see the sketch after this list).
  3. Akamai uses only the video level to adapt the video source to the available bandwidth, whereas the frame rate of the video is kept constant.
  4. When a sudden drop in the available bandwidth occurs, short interruptions of the video playback can occur due to the large actuation delay.
  5. For a sudden increase of the available bandwidth, the transient time to match the new bandwidth is roughly 150 seconds.
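
To make findings 2 and 3 concrete, here's a minimal sketch of the kind of client-side control loop the paper implies: every couple of seconds the client estimates the available bandwidth and, holding the frame rate fixed, requests the highest of the pre-encoded bitrate levels that fits. All names and numbers below are illustrative; this is not Akamai's actual algorithm.

```python
import time

# The five pre-encoded bitrate levels (finding 1), in kbps.
# These particular numbers are made up for illustration.
BITRATE_LEVELS_KBPS = [300, 700, 1500, 2500, 3500]

def pick_level(available_kbps):
    """Choose the highest encoded level that fits the measured bandwidth;
    the frame rate is never touched (finding 3)."""
    fitting = [b for b in BITRATE_LEVELS_KBPS if b <= available_kbps]
    return max(fitting) if fitting else BITRATE_LEVELS_KBPS[0]

def control_loop(estimate_bandwidth_kbps, send_switch_command,
                 interval_s=2.0, steps=10):
    """Run the adaptation decision roughly every 2 seconds (finding 2).
    Both callables are supplied by the player: one measures recent
    throughput, the other asks the server for a different encoding."""
    current = BITRATE_LEVELS_KBPS[0]
    for _ in range(steps):
        target = pick_level(estimate_bandwidth_kbps())
        if target != current:
            send_switch_command(target)
            current = target
        time.sleep(interval_s)
```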

Abstract:

Click to read more ...

Wednesday
Feb 2, 2011

Piccolo - Building Distributed Programs that are 11x Faster than Hadoop

Piccolo (not this or this) is a system for distributed computing. Piccolo is a new data-centric programming model for writing parallel in-memory applications in data centers. Unlike existing data-flow models, Piccolo allows computation running on different machines to share distributed, mutable state via a key-value table interface. Where traditional data-centric models (such as Hadoop) present the user with a single object at a time to operate on, Piccolo exposes a global table interface that is available to all parts of the computation simultaneously. This allows users to specify programs in an intuitive manner, very similar to writing programs for a single machine.
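
As a rough illustration of that model (using made-up Python names, not Piccolo's actual API), here is what a PageRank-style kernel looks like when every worker can read and accumulate into a shared table:

```python
# Sketch of a Piccolo-style kernel: computation on every machine reads and
# writes a shared, mutable key-value table instead of receiving one record
# at a time. `Table` and its methods are hypothetical stand-ins for
# Piccolo's partitioned-table interface.

class Table(dict):
    def update(self, key, delta):
        # Piccolo resolves concurrent writes with a user-supplied
        # accumulator; here the accumulator is simply addition.
        self[key] = self.get(key, 0.0) + delta

curr_rank = Table()   # conceptually shared across all workers
next_rank = Table()

def pagerank_kernel(my_pages, links):
    """Each worker runs this over its own shard of pages, but reads and
    updates ranks for *any* page through the global table."""
    for page in my_pages:
        share = 0.85 * curr_rank.get(page, 1.0) / len(links[page])
        for target in links[page]:
            next_rank.update(target, share)  # target may live on another machine

# Tiny usage example on a two-page graph.
links = {"a": ["b"], "b": ["a"]}
pagerank_kernel(["a", "b"], links)
print(next_rank)
```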

Using an in-memory key-value store is a very different approach from the canonical map-reduce, which is based on using distributed file systems. The results are impressive:

Experiments have shown that Piccolo is fast and provides excellent scaling for many applications. The performance of PageRank and k-means on Piccolo is 11× and 4× faster than that of Hadoop. Computing a PageRank iteration for a 1 billion-page web graph takes only 70 seconds on 100 EC2 instances. Our distributed web crawler can easily saturate a 100 Mbps internet uplink when running on 12 machines.

Piccolo was presented at OSDI '10. For the paper, take a look at Piccolo: Building Fast, Distributed Programs with Partitioned Tables, here's the slide deck, and there's a video of the talk (very good).

Click to read more ...

Thursday
Jan 27, 2011

Comet - An Example of the New Key-Code Databases

Comet is an active distributed key-value store built at the University of Washington. The paper describing Comet is Comet: An active distributed key-value store; there are also slides and an MP3 of a presentation given at OSDI '10. Here's a succinct overview of Comet:

Today's cloud storage services, such as Amazon S3 or peer-to-peer DHTs, are highly inflexible and impose a variety of constraints on their clients: specific replication and consistency schemes, fixed data timeouts, limited logging, etc. We witnessed such inflexibility first-hand as part of our Vanish work, where we used a DHT to store encryption keys temporarily. To address this issue, we built Comet, an extensible storage service that allows clients to inject snippets of code that control their data's behavior inside the storage service.

I found this paper quite interesting because it takes the initial steps of co-locating code with a key-value store, which turns it into what might be called a key-code store. This is something I've been exploring as a way of moving behavior to data in order to overcome network limitations in the cloud and provide other benefits. An innovator in this area is the Alchemy Database, which has already combined Redis and Lua. A good platform for this sort of thing might be Node.js with its embedded V8 engine, which would allow complex JavaScript programs to run in an efficient evented container. There are a lot of implications of this sort of architecture, more about that later, but the Comet paper describes a very interesting start.
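
To make the key-code idea concrete, here's a toy sketch in Python. Comet actually injects sandboxed Lua handlers into its DHT; everything below, names included, is illustrative rather than Comet's real interface:

```python
# A toy "key-code" store: each value can carry handlers that run inside
# the store on access, so behavior travels with the data. This mimics the
# shape of Comet's active objects, not its actual API.

import time

class ActiveObject:
    def __init__(self, value, on_get=None, ttl_s=None):
        self.value = value
        self.on_get = on_get       # injected code, runs on every read
        self.created = time.time()
        self.ttl_s = ttl_s         # per-object timeout chosen by the client

class KeyCodeStore:
    def __init__(self):
        self._objects = {}

    def put(self, key, obj):
        self._objects[key] = obj

    def get(self, key):
        obj = self._objects.get(key)
        if obj is None:
            return None
        if obj.ttl_s is not None and time.time() - obj.created > obj.ttl_s:
            del self._objects[key]  # client-controlled expiry, not a fixed timeout
            return None
        if obj.on_get:
            obj.on_get(obj)         # the data's own behavior runs in the store
        return obj.value

# Usage: a key that logs its reads and vanishes after an hour, in the
# spirit of the Vanish use case mentioned above.
store = KeyCodeStore()
store.put("session:42", ActiveObject(
    value="encryption-key-material",
    on_get=lambda o: print("read at", time.time()),
    ttl_s=3600))
print(store.get("session:42"))
```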

From the abstract and conclusion:

Click to read more ...

Tuesday
Jan 18, 2011

Paper: Relational Cloud: A Database-as-a-Service for the Cloud

The Relational Cloud Project is an effort by a group of researchers at MIT to investigate technologies and challenges related to Database-as-a-Service within cloud computing. They are trying to figure out how the advantages of the DaaS (Database-as-a-Service) model, which we've seen arise in other areas like OLAP and NoSQL, can be applied to relational databases. The DaaS advantages as they see them are: 1) predictable costs, proportional to the quality of service and actual workloads; 2) lower technical complexity, thanks to a unified and simplified service access interface; and 3) virtually infinite resources ready at hand. Their approach is explained in the paper Relational Cloud: A Database-as-a-Service for the Cloud. From the abstract:

Click to read more ...

Tuesday
Jan 11, 2011

Google Megastore - 3 Billion Writes and 20 Billion Read Transactions Daily

A giant step into the fully distributed future has been taken by the Google App Engine team with the release of their High Replication Datastore. The HRD is targeted at mission critical applications that require data replicated to at least three datacenters, full ACID semantics for entity groups, and lower consistency guarantees across entity groups.

This is a major accomplishment. Few organizations can implement a true multi-datacenter datastore. Other than SimpleDB, how many other publicly accessible database services can operate out of multiple datacenters? Now that capability can be had by anyone. But there is a price, literally and otherwise. Because the HRD uses three times the resources of Google App Engine's Master/Slave datastore, it will cost three times as much. And because it is a distributed database, with all that implies in the CAP sense, developers will have to be very careful in how they architect their applications: costs increase, reliability increases, complexity increases, and performance decreases. This is why HRD is targeted at mission critical applications; you gotta want it, otherwise the Master/Slave datastore makes a lot more sense.
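
To see what "full ACID semantics for entity groups" means in practice, here's a sketch using App Engine's Python db API of the era: entities sharing a root ancestor form one entity group, and a transaction can atomically update anything within that group. The model names and fields here are made up; the ancestor/transaction pattern is the real mechanism.

```python
# Sketch of entity-group transactions on the High Replication Datastore.
# Entities keyed under the same root ancestor form one entity group;
# transactions are ACID within a group, weaker across groups.

from google.appengine.ext import db

class Account(db.Model):
    balance = db.IntegerProperty(default=0)

class LedgerEntry(db.Model):
    amount = db.IntegerProperty()

def credit(account_key, amount):
    # The account and its ledger entries live in one entity group, so
    # this read-modify-write plus the ledger insert commit atomically.
    account = db.get(account_key)
    account.balance += amount
    entry = LedgerEntry(parent=account_key, amount=amount)
    db.put([account, entry])

account = Account()
account.put()
db.run_in_transaction(credit, account.key(), 100)
```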

The technical details behind the HRD are described in this paper, Megastore: Providing Scalable, Highly Available Storage for Interactive Services. This is a wonderfully written and accessible paper, chock-full of useful and interesting details. James Hamilton wrote an excellent summary of the paper in Google Megastore: The Data Engine Behind GAE. There are also a few useful threads in Google Groups that go into more detail about how it works, costs, and performance (the original announcement, performance comparison).

Some Megastore highlights:

Click to read more ...

Thursday
Dec 23, 2010

Paper: CRDTs: Consistency without concurrency control

For a great Christmas read, forget The Night Before Christmas, the heart-warming poem written by Clement Moore for his children that created the modern idea of Santa Claus we all know and anticipate each Christmas Eve. Instead, curl up with some potent eggnog, nog being any drink made with rum, and read CRDTs: Consistency without concurrency control by Mihai Letia, Nuno Preguiça, and Marc Shapiro, which covers CRDTs (Commutative Replicated Data Types), data types whose operations commute when they are concurrent.
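
Before diving in, here's a minimal sketch of the classic warm-up example, a replicated grow-only counter: each replica increments only its own slot, and merging takes the element-wise maximum, so concurrent operations commute and replicas converge with no concurrency control. (This counter is just an illustration of the CRDT idea; the paper's own centerpiece is a replicated sequence for cooperative text editing.)

```python
# A minimal CRDT: a grow-only counter (G-Counter). Each replica owns one
# slot; increments touch only that slot, and merging two states takes the
# element-wise max. Merge is commutative, associative, and idempotent, so
# any two replicas that have seen the same updates converge to the same
# value with no locks and no coordination.

class GCounter:
    def __init__(self, replica_id, n_replicas):
        self.replica_id = replica_id
        self.slots = [0] * n_replicas

    def increment(self, amount=1):
        self.slots[self.replica_id] += amount

    def value(self):
        return sum(self.slots)

    def merge(self, other):
        """Commutative, associative, idempotent join of two states."""
        self.slots = [max(a, b) for a, b in zip(self.slots, other.slots)]

# Two replicas increment concurrently, then exchange state in either order.
a, b = GCounter(0, 2), GCounter(1, 2)
a.increment(3)
b.increment(5)
a.merge(b)
b.merge(a)
assert a.value() == b.value() == 8
```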

From the introduction, which also serves as a nice concise overview of distributed consistency issues:

Click to read more ...

Friday
Dec 3, 2010

GPU vs CPU Smackdown: The Rise of Throughput-Oriented Architectures

In some ways the original Amazon cloud, the one most of us still live in, was like that really cool house where, when you stepped inside and saw the old green shag carpet in the living room, you knew the place hadn't been updated in a while. The network is a little slow, the processors are a bit dated, and virtualization makes the house just feel smaller. It has been difficult to run high bandwidth or low latency workloads in the cloud. Bottlenecks everywhere. Not a big deal for most applications, but for many high performance computing (HPC) applications it was a killer.

In a typical house you might just do a remodel. Upgrade a few rooms. Swap out builder-quality appliances for gleaming stainless steel monsters. But Amazon has a big lot; instead of remodeling, they simply keep adding entire new wings, kind of like the Winchester Mystery House of computing.

The first new wing added was a CPU-based HPC system featuring blazingly fast Nehalem chips, virtualization replaced by a close-to-metal Hardware Virtual Machine (HVM) architecture, and a monster 10 gigabit network with the ability to specify placement groups to carve out a low-latency, high-bandwidth cluster. Bottlenecks removed. Most people probably don't even know this part of the house exists.

The newest addition is a beauty: a graphics processing unit (GPU) cluster, as described by Werner Vogels in Expanding the Cloud - Adding the Incredible Power of the Amazon EC2 Cluster GPU Instances. It's completely modern and contemporary. The shag carpet is out. In are Nvidia M2050 GPU-based clusters, which make short work of applications in the sciences, finance, oil & gas, movie studios, and graphics.

Click to read more ...

Tuesday
Nov 9, 2010

Paper: Hyder - Scaling Out without Partitioning 

Partitioning is what differentiates scaling out from scaling up, isn't it? I thought so too, until I read Pat Helland's blog post on Hyder, a research database at Microsoft, in which the database is the log, no partitioning is required, and the database is multi-versioned. Not much is available on Hyder. There's the excellent summary post from Mr. Helland and these documents: Scaling Out without Partitioning and Scaling Out without Partitioning - Hyder Update by Phil Bernstein and Colin Reid of Microsoft.
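
The shape of the idea, as I read it: there is no partitioned storage at all. Every server appends transaction intentions to one shared, multi-versioned log and independently rolls the log forward, aborting intentions whose read snapshots have gone stale. Here's a toy sketch of that shape in Python (illustrative only; Hyder's actual roll-forward, called meld, operates over multi-versioned binary search trees):

```python
# Toy sketch of a "database is the log" design: servers append transaction
# intentions to a single shared log, and each server independently replays
# the log, committing an intention only if nothing it read changed after
# its snapshot. No partitioning, no lock manager.

class SharedLog:
    def __init__(self):
        self.records = []   # the log *is* the database

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1   # position doubles as a version

def meld(log):
    """Roll the log forward into materialized state. Each record is
    (snapshot_pos, keys_read, writes); it commits only if every key it
    read is unchanged since its snapshot."""
    state, versions = {}, {}
    for pos, (snapshot, reads, writes) in enumerate(log.records):
        if all(versions.get(k, -1) <= snapshot for k in reads):
            for k, v in writes.items():
                state[k] = v
                versions[k] = pos
    return state

log = SharedLog()
t0 = log.append((-1, [], {"x": 1}))   # commits
log.append((t0, ["x"], {"x": 2}))     # commits: x unchanged since snapshot t0
log.append((t0, ["x"], {"x": 99}))    # aborts: x was updated after t0
print(meld(log))                      # {'x': 2}
```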

The idea behind Hyder as summarized by Pat Helland (see his blog for the full post):

Click to read more ...

Friday
Oct 22, 2010

Paper: Netflix’s Transition to High-Availability Storage Systems 

In an audacious move for such an established property, Netflix is moving their website out of the comfort of their own datacenter and into the wilds of the Amazon cloud. This paper by Netflix's Siddharth “Sid” Anand, Netflix’s Transition to High-Availability Storage Systems, gives a detailed look at this transition and does a deep dive on SimpleDB best practices, focusing especially on techniques useful to those who are making the move from an RDBMS.

Sid is going to give a talk at QCon based on this paper and he would appreciate your feedback. So if you have any comments or thoughts, please comment here, email Sid at r39132@hotmail.com, or reach him on Twitter at @r39132. Here's the introduction from the paper:

Click to read more ...

Thursday
Oct 21, 2010

Machine VM + Cloud API - Rewriting the Cloud from Scratch

Write a little "Hello World" program these days and it runs inside a bewildering Russian Doll of nested environments, each layer adding its own special performance and complexity tax. First, a language executes in its own environment of data structure libraries, memory management, and so on. That, more often than not, runs inside a language VM like the JVM, CLR, or V8. The language VM in turn runs inside a process that runs inside an OS. An application will run in one or more threads inside a process. And the whole thing will run inside a machine-sharing VM layer like Xen. And across all of that are frameworks for monitoring, elasticity, storage, and so on. That's a lot of overhead for such a little program.

What if we could remove all these taxes and run directly on the new bare metal, which some consider to be a combination of Machine VM + Cloud API? That's exactly what a system called Mirage, described in the paper Turning down the LAMP: Software Specialisation for the Cloud, sets out to do by treating the cloud virtual hardware as a compiler target, and converting high-level language source code directly into kernels that run on it.

Click to read more ...
