Entries in ambient (3)

Tuesday
Nov 09, 2010

Facebook Uses Non-Stored Procedures to Update Social Graphs

Facebook's Ryan Mack gave a MySQL Tech Talk in which he described using what he called Non-stored Procedures for adding edges to Facebook's social graph. The question is: how can edges quickly be added to the social graph? The answer ultimately comes down to deciding where logic should be executed, especially when locks are held across network hops.

Ryan explained that a key element of the Facebook data model is the set of connections between people, the things they've liked, and the places they've checked in to. A lot of their writes are adding edges to the social graph.

Currently this is a two-step process, run inside a transaction:

Click to read more ...

Wednesday
Aug 18, 2010

Misco: A MapReduce Framework for Mobile Systems - Start of the Ambient Cloud?

Misco: A MapReduce Framework for Mobile Systems is a very exciting paper to me because it's really one of the first explorations of some of the ideas in Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud. What they are trying to do is efficiently distribute work across a set of cellphones using a now familiar MapReduce interface. Usually we think of MapReduce as working across large clusters hosted in data centers. Here, the cluster nodes are cellphones not contained in any data center, but compute nodes potentially distributed everywhere.

I talked briefly with Adam Dou, one of the paper's authors, and he said they don't see cellphone clusters replacing dedicated computer clusters, primarily because of the power required for both network communication and the map-reduce computations. Large multi-terabyte jobs aren't in the cards...yet. Adam estimates that, computationally, cellphones perform about as well as desktops of ten years ago. Instead, they want to focus on the unique characteristics of the mobile devices--camera, microphone, GPS and other directly collectable data--so the data can be processed where it is collected.

MapReduce was selected as the programming interface because it is familiar to programmers, it transparently supports programming multiple devices, and it can be implemented--especially using Python--in such a way that programmers are freed from all the underlying details like concurrency, data distribution, and code management. A very smart move in my estimation.
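To make the appeal of that interface concrete, here is a minimal single-process sketch in Python of what a MapReduce-style programming model looks like from the programmer's side. This is not Misco's API; `run_mapreduce`, `map_words`, and `reduce_counts` are hypothetical names made up for illustration, and a real framework would distribute the map and reduce calls across devices rather than run them in a loop.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn):
    """Hypothetical single-process stand-in for a MapReduce runtime."""
    # Map phase: each input record yields zero or more (key, value) pairs,
    # which are grouped by key (the "shuffle" step).
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Reduce phase: one call per key, given every value emitted for it.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def map_words(line):
    # Emit (word, 1) for every word in a line of text.
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    # Sum the per-occurrence counts for one word.
    return sum(counts)


if __name__ == "__main__":
    lines = ["the ambient cloud", "the data center cloud"]
    print(run_mapreduce(lines, map_words, reduce_counts))
```

The programmer writes only the two small functions. Where they run, how the data gets there, and how results are collected is the runtime's problem, which is the property the paragraph above is praising.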

It's interesting to contrast the economics of the ambient cloud with the economics of the data center cloud. The goal of a data center cloud is 100 percent utilization. Use every possible CPU cycle or money is being wasted on unused equipment. In an ambient cloud the idea is more parasitic: deploy to more resources, yet leave the primary function of each device unaffected. It's a different perspective that may lead to different architectures.

A quick introduction to Misco from the abstract:

Click to read more ...

Wednesday
Dec 16, 2009

Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud

"But it is not complicated. [There's] just a lot of it." 
-- Richard Feynman on how the immense variety of the world arises from simple rules.

Contents:

  1. Have We Reached the End of Scaling?
  2. Applications Become Black Boxes Using Markets to Scale and Control Costs
  3. Let's Welcome our Neo-Feudal Overlords
  4. The Economic Argument for the Ambient Cloud
  5. What Will Kill the Cloud?
  6. The Amazing Collective Compute Power of the Ambient Cloud
  7. Using the Ambient Cloud as an Application Runtime
  8. Applications as Virtual States
  9. Conclusion

We have not yet begun to scale. The world is still fundamentally disconnected, and for all our wisdom we are still in the earliest days of learning how to build truly large planet-scale applications.

Today 350 million users on Facebook is a lot of users and five million followers on Twitter is a lot of followers. This may seem like a lot now, but consider that we have no planet-wide applications yet. None.

Tomorrow the numbers foreshadow a new Cambrian explosion of connectivity that will look as different as the image of a bare lifeless earth looks to us today. We will have 10 billion people, we will have trillions of things, and we will have a great multitude of social networks densely interconnecting all these people to people, things to things, and people to things.

How can we possibly build planet-scalable systems to handle this massive growth if building much smaller applications currently stresses architectural best practices past breaking? We can't. We aren't anywhere close to building applications at this scale, except for perhaps Google and a few others, and there's no way you and I can reproduce what they are doing. Companies are scrambling to raise hundreds of millions of dollars in order to build even more datacenters. As the world becomes more and more global and more and more connected, handling the load may require building applications 4 or 5 orders of magnitude larger than any current system. The cost for an infrastructure capable of supporting planet-scale applications could be in the 1 trillion dollar range (very roughly estimated at $100 million a data center times 10K data centers).

If you aren't Google, or a very few other companies, how can you possibly compete? For a glimmer of a possible direction that may not require a kingdom's worth of resources, please take a look at this short video:

Click to read more ...