Monday, February 1, 2010

What Will Kill the Cloud?

This is an excerpt from my article Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud.

If datacenters are the new castles, then what will be the new gunpowder? As soon as gunpowder came on the scene, castles, those grand defensive structures, quickly became the future's cold, drafty hotels. Gunpowder-fueled cannonballs made short work of castle walls.

There's a long history of "gunpowder" type inventions in the tech industry. PCs took out the timeshare model. The cloud is taking out the PC model. There must be something that will take out the cloud.

Right now it's hard to believe the cloud will one day be no more. It seems so much the future, but something will transcend the cloud. We even have a law that says so: Bell's Law of Computer Classes, which holds that roughly every decade a new, lower-priced computer class forms, based on a new programming platform, network, and interface, resulting in new usage and the establishment of a new industry. A computer class in this context is defined as a set of computers in a particular price range with unique or similar programming environments (e.g. Linux, OS/360, Palm, Symbian, Windows) that support a variety of applications that communicate with people and/or other systems.

We've been through a few computer evolutions before. Here's a list:

  1. Mainframes (1960s)
  2. Minicomputers (1970s)
  3. PCs and Local Area Networks (1980s)
  4. Datacenter is the Computer (1990s)
  5. Smartphones (2000s)
  6. Wireless Sensor Networks (>2005)
  7. Body Area Networks (>2010). These are dust-sized chips with relatively small numbers of transistors that enable the creation of ubiquitous, radio-networked, implantable sensing platforms, making everything and everybody part of a wireless sensor network class. Field Programmable Logic Array chips with tens to hundreds of millions of cells exist as truly universal devices for building “anything”.

The first part of this list may be somewhat familiar. Though hardly anyone living has seen a mainframe or minicomputer, they really did exist, much like the dinosaurs. Companies like IBM, Honeywell, HP, and DEC dominated these eras.

After mainframes and minis came personal computers. Much like the production of affordable cars brought us the freedom of the open road, PCs freed us from centralized bureaucracies and allowed us to travel the Internets without restriction. Companies like IBM, Microsoft, Apple, and Dell dominated this era. It also happens to be the era we are just leaving. Please wave it a hearty and warm goodbye.

Fickle beings that we are, it turns out we want both the freedom of an open mobile road and centralized IT. PCs became too complex and far more powerful than even a hardened gamer can exploit. This has brought on the era of the cloud. The cloud treats large clusters of powerful networked computers as a single abstraction; thus the Datacenter is the Computer. Notice how clouds combine aspects of mainframes, minis, and networked PCs into a whole new thing. We've been entering the cloud era for a while, but it is really just starting to take off. Companies like Google, Facebook, and Amazon currently dominate the cloud. Microsoft, IBM, and others are trying like heck not to get left behind.

At one point I thought the cloud was just a transitory phase until we took the next technological leap. This is still true in the far future, but for the long now I see the cloud not as a standalone concentration of dense resources, but as one part of a diffuse overlay constructed from the cloud and the next three generations of Bell's Classes: Smartphone, Wireless Sensor Network, and Body Area Network.

While we have cloud masters creating exquisite platforms for developers, we haven't made much progress on mastering smartphones, wireless sensor networks, and body area networks. Why isn't hard to understand. Smartphones are just coming into their own as computing platforms. Wireless sensor networks hardly exist yet. And body area networks don't exist at all yet.

There's also a capability problem. What would kill the cloud is moving the cloud's defining characteristics outside the datacenter. Create super low latency, super high bandwidth networks, using super fast CPUs and super dense storage - that would be a cannon shot straight through the castle walls of the datacenter.

The likelihood of this happening is slim. Part of the plan is on track: super fast CPUs and storage are well on their way. What we, in the US at least, won't have is widespread super low latency and super high bandwidth connections. These exist within datacenters, between datacenters, and in backhaul networks to datacenters, but on a point-to-point basis the US bandwidth picture is grim.

Nielsen’s Law holds that Internet bandwidth grows at an annual rate of 50 percent, while computing power grows at 60 percent. A seemingly small difference, but over a 10-year period it means computing power grows 100x while bandwidth grows only 57x. So network speeds won't grow as quickly as servers can be added. Ironically, this is the same problem chip designers are having with multi-core systems. Chip designs can add dozens of processor cores, but system bus speeds are not keeping pace, so all that computational power goes underutilized.
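To see how that gap compounds, here's a quick back-of-envelope calculation using the 50 and 60 percent annual rates quoted above:

```python
# Compound Nielsen's Law (bandwidth, ~50%/yr) against computing
# power (~60%/yr) over a ten-year span.
YEARS = 10
bandwidth_growth = 1.50 ** YEARS  # ~57x
compute_growth = 1.60 ** YEARS    # ~110x, i.e. roughly 100x

print(f"Bandwidth grows ~{bandwidth_growth:.0f}x over {YEARS} years")
print(f"Compute grows ~{compute_growth:.0f}x over {YEARS} years")
print(f"Compute outpaces bandwidth by ~{compute_growth / bandwidth_growth:.1f}x")
```

The gap looks small year over year but roughly doubles each decade, which is why adding servers outpaces the network's ability to connect them.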

What Bell's 3 Classes do provide in abundance is almost unimaginable parallelism. Fortunately for us the absolute driving force of scale is parallelism. So it seems what is needed to create an open, market driven Ambient Cloud is an approach that exploits massive parallelism, fast CPUs, large pools of fast storage, and a frustratingly slow and unreliable network. In later sections we'll see if this may be possible.

If this sounds impossible take a look at Futurist Ray Kurzweil's explanation of The Law of Accelerating Returns. The basic idea is that change happens much faster than we think: An analysis of the history of technology shows that technological change is exponential, contrary to the common-sense "intuitive linear" view. So we won't experience 100 years of progress in the 21st century -- it will be more like 20,000 years of progress (at today's rate).

The lesson is if you think all the stuff is really far off in the future, it's actually just around the corner.

The Amazing Collective Compute Power of the Ambient Cloud

Earlier we talked about how a single botnet could harness more compute power than our largest supercomputers. Well, that's just the start of it. The amount of compute power available to the Ambient Cloud will be truly astounding.

2 Billion Personal Computers

The number of personal computers is still growing. One estimate has 2 billion PCs by 2014. That's a giant reservoir of power to exploit, especially considering these new boxes are stuffed with multiple powerful processors and gigabytes of memory.

7 Billion Smartphones

By now it's common wisdom that smartphones are the computing platform of the future. It's plausible to assume the total number of mobile phones in use will roughly equal the number of people on earth. That's 7 billion smartphones.

Smartphones aren't just tiny little wannabe computers anymore either. They are real computers and are getting more capable all the time. The iPhone 3GS, for example, would have qualified as a supercomputer a few decades ago. It runs at 600 MHz, has 256MB of RAM, and 32 GB of storage. Not bad at all. In a few more iterations phones will be the new computer.

The iPhone is not unique. Android, Symbian, BlackBerry, and the Palm Pre are all going in the same direction. Their computing capabilities will only increase as smartphones are fitted with more processors and more graphics processors. Innovative browsers and operating systems are working on ways of harnessing all that power.

Tilera founder Anant Agarwal estimates that by 2017 embedded processors could have 4,096 cores, server CPUs might have 512 cores, and desktop chips could use 128 cores. Some disagree, saying this is too optimistic, but Agarwal maintains the number of cores will double every 18 months.
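As a sanity check on that doubling rate, here's the compounding worked out (the 2010 starting core counts are my own illustrative guesses, not Agarwal's figures):

```python
# Agarwal's projection: core counts doubling every 18 months.
def cores_in_year(cores_now, start_year, target_year):
    doublings = (target_year - start_year) / 1.5  # one doubling per 18 months
    return cores_now * 2 ** doublings

# Working back from the 2017 figures quoted above: a ~5-core desktop
# chip and a ~20-core server CPU in 2010 land close to 128 and 512.
print(round(cores_in_year(5, 2010, 2017)))   # ~127
print(round(cores_in_year(20, 2010, 2017)))  # ~508
```

Seven years at one doubling per 18 months is about 4.7 doublings, or a ~25x multiplier, which is what makes the 2017 numbers plausible under his assumption.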

That's a lot of cores. That's a lot of compute power. That's a lot of opportunity.

It's not just cores that are on the rise. Memory, following an exponential growth curve, will be truly staggering. One Google exec estimates that in 12 years an iPod will be able to store all the video ever produced.

But all the compute power in the world is of little use if the cores can't talk to each other.

The Cell Network Bandwidth and Latency Question

Aren't cell phone networks slow, with high latencies? Currently cell networks aren't ideal. But the technology trend is toward much higher bandwidth cell connections and reasonable latencies.

Here's a rundown of some of the available bandwidths and latencies. For cell connections your mileage may vary considerably, as cell performance varies on a cell-by-cell basis according to each site's local demographics, projected traffic demand, and the target coverage area of the cell.

  1. WiFi networks provide latencies on the order of 1 to 10 milliseconds at 1.9 Mbps (megabits per second).
  2. Ping times on the EDGE network are reported to be in the 700 to 1500 ms range at 200 Kbps to 400 Kbps (often much lower).
  3. In New York, HSDPA (High-Speed Downlink Packet Access, a 3.5G-type network) has latencies between 100 and 250 milliseconds and 300 KB/s of bandwidth. Note that's bytes, not bits.
  4. Amazon has average latencies of 1 ms across availability zones and between 0.215 ms and 0.342 ms within availability zones.
  5. In 2012 AT&T plans to move to a 4G network based on LTE (Long Term Evolution). LTE is projected to offer 50 to 100 Mbps of downstream service at 10 ms latency.
  6. On a 1 Gbps network a 1000-bit packet takes about 1 microsecond to transmit.
  7. Within datacenters, using high bandwidth 1-100 Gbps interconnects, latency is less than 1 ms within a rack and less than 5 ms across a datacenter. Between datacenters the bandwidth is far lower, at 10 Mbps-1 Gbps, and latency is in the 100s of ms realm.
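Item 6 is just serialization delay, which is easy to verify with a minimal helper (propagation and queuing delays come on top of this):

```python
# Serialization delay: the time to push a packet's bits onto the wire.
def serialization_delay_us(packet_bits, link_bps):
    """Return transmit time in microseconds."""
    return packet_bits / link_bps * 1e6

# A 1000-bit packet on a 1 Gbps link: ~1 microsecond, as in item 6.
print(serialization_delay_us(1000, 1e9))    # 1.0

# The same packet on a 400 Kbps EDGE link: 2.5 ms of transmit time alone.
print(serialization_delay_us(1000, 400e3))  # 2500.0
```

The EDGE ping times in item 2 are dominated by the radio network rather than serialization, which is why they run to hundreds of milliseconds.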

Current bandwidth rates and latencies for cell networks don't make them a sturdy platform on which to build a cloud. Projected out, though, they look pretty competitive, especially compared to the between-datacenter numbers, which are the key competition for future cloud applications.

True HPC (high performance computing) low-latency-interconnect applications won't find a cell-based cloud attractive at all. It's likely they will always host in specialty clouds. But for applications that can be designed to be highly parallel and to tolerate latencies, cell-based clouds look attractive. Starting 10,000 jobs in parallel and getting the answers back in the 100 ms range will work for a lot of apps, as this is how applications are structured today. Specialty work can be directed to specialty clouds and the results merged in as needed.
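That jobs-in-parallel pattern is classic scatter-gather. Here's a minimal sketch; the worker function, job count, and 100 ms budget are all illustrative stand-ins, not a real cell-cloud API:

```python
# Scatter-gather: fan a task out to many workers in parallel and merge
# whatever answers arrive within the latency budget.
from concurrent.futures import ThreadPoolExecutor, TimeoutError, as_completed

def run_job(job_id):
    # Stand-in for a call out to one node of the ambient cloud.
    return job_id * job_id

def scatter_gather(n_jobs, budget_s):
    results = []
    with ThreadPoolExecutor(max_workers=64) as pool:
        futures = [pool.submit(run_job, i) for i in range(n_jobs)]
        try:
            for fut in as_completed(futures, timeout=budget_s):
                results.append(fut.result())
        except TimeoutError:
            pass  # drop stragglers and merge what we have
    return results

answers = scatter_gather(n_jobs=1000, budget_s=0.1)
print(len(answers), "answers arrived within the budget")
```

The key design choice is treating the latency budget as a deadline rather than waiting for every node: stragglers are simply dropped, which is what makes a high-latency, unreliable substrate workable for embarrassingly parallel jobs.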

Won't faster networking and more powerful processors use more power? Aye, there's the rub. But as we've seen with the new iPhone it's possible to deliver more power and a longer battery life with more efficient hardware and better software. Inductive chargers will also make it easier to continually charge devices. Nokia is working on wireless charging. And devices will start harvesting energy from the surroundings. So it looks like the revolution will be fully powered.

Smart Grid 1,000 Times Larger than the Internet?

Another potential source of distributed compute power is the sensors on smart grids. Literally billions of dollars are being invested into developing giant sensor grids to manage power. Other grids will be set up for water, climate, pollution, terrorist attacks, traffic, and virtually everything else you can think to measure and control.

One estimate puts 50 billion devices on the Internet. Others predict the smart grid could be 1,000 times larger than the Internet.

While these are certainly just educated guesstimates about the potential size of the smart grid, it's clear it forms another massive potential compute platform for the cloud.

33 Million Servers (and growing) in the Traditional Cloud

According to IDC, as of 2007 there were 30.3 million servers in the world. Is the number 50 million now? Will it be 100 million in 5 years? New datacenters continually come on-line, so there's no reason to expect the total to stop growing. The Ambient Cloud has full access to these servers as well as non-datacenter computer resources.

Body Area Networks

Body Area Networks are the last of Bell's set of computer classes and are probably the least familiar of the group. BANs are sensor networks in and around the human body. They sound so strange you may even be skeptical that they exist, but they are real. There's an IEEE BAN Task Group and there's even a cool BAN conference in Greece. You can't get much realer than that.

Clearly this technology has obvious health and medical uses, and it may also figure into consumer and personal entertainment.

Where do BANs fit into the Ambient Cloud? There are billions of humans, and with multiple processors per human plus a communication network, it will be possible to integrate another huge pool of compute resources into the larger grid.

What if smartphones become the cloud?

Let's compare the collective power of PCs + smartphones + smart grid + smart everything + BANs with the traditional cloud: it's trillions of processors against many tens of millions of servers. This number absolutely dwarfs the capacity of the traditional cloud.

One author wrote we'll be all set when smartphones can finally sync to the cloud. What if that's backwards? What if instead smartphones become the cloud?

Texai

It's really hard to get a feel for what having all this distributed power means. As a small example, take a look at Texai, an open source project to create artificial intelligence. It estimates that if one hundred thousand volunteers and users worldwide download its software and donate computer resources, then assuming an average system configuration of 2 cores, 4 GB RAM, and 30 GB available disk, it would have a potential peer-to-peer aggregate of volunteered processing power of: 200,000 cores, 400 TB RAM, 3 PB disk. A stunning set of numbers. I was going to calculate the cost of that in Amazon, but I decided not to bother. It's a lot.
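Checking the arithmetic on that aggregate, using simple decimal units:

```python
# Texai's aggregate: 100,000 volunteer machines, each donating
# 2 cores, 4 GB RAM, and 30 GB of available disk.
volunteers = 100_000
cores = volunteers * 2                 # 200,000 cores
ram_tb = volunteers * 4 / 1000         # 400 TB of RAM
disk_pb = volunteers * 30 / 1_000_000  # 3 PB of disk

print(f"{cores:,} cores, {ram_tb:.0f} TB RAM, {disk_pb:.0f} PB disk")
```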

SETI@home as CPU Cycle User

Of course we knew all this already. SETI@home, for example, has been running since 1999. With 5.2 million participants, SETI@home now has the staggering ability to compute over 528 TeraFLOPS. Blue Gene, one of the world's fastest supercomputers, peaks at just over 596 TFLOPS. And there are many, many more distributed computing projects like SETI@home supplying huge amounts of compute power to their users.

Then there's Plura Processing. Their technology lets visitors to participating webpages become nodes in a distributed computing network. Customers buy time on this network to perform massively distributed computations at around a tenth the cost of running the same computation on a cluster or in the cloud. Nobody goes hungry at a pot luck.

An example Plura customer is 80legs. 80legs has released an innovative web-crawling infrastructure built on Plura that can crawl the web for the low, low price of $2 per million pages using a network of 50,000 computers. It's cheap because those computers already have excess capacity that can easily be loaned out without noticeable degradation.
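To put that price in perspective, here's the arithmetic for a hypothetical billion-page crawl using the figures quoted above:

```python
# What "$2 per million pages" means at scale for 80legs' Plura-backed crawler.
price_per_million = 2.00
pages = 1_000_000_000  # a hypothetical billion-page crawl
nodes = 50_000         # the quoted network size

cost = pages / 1_000_000 * price_per_million
print(f"Crawling {pages:,} pages costs ${cost:,.0f}")  # $2,000
print(f"That's {pages // nodes:,} pages per node")     # 20,000
```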

Exploiting all that Capacity

In the future compute capacity will be everywhere. This is one of the amazing gifts of computer technology and also why virtualization has become such a hot datacenter trend.

It's out of that collective capacity that an Ambient Cloud can be formed, like a galaxy is formed from interstellar dust. We need to find a more systematic way of putting it to good use. Plura is an excellent example of how these resources can be used as a compute grid; the next step is to think of how all these resources can be used as an application runtime.

Nicholas Carr, in The coming of the megacomputer, reminds us that we might not even be able to imagine what can be created with all our new toys:

Every time there’s a transition to a new computer architecture, there’s a tendency simply to assume that existing applications will be carried over (ie, word processors in the cloud). But the new architecture actually makes possible many new applications that had never been thought of, and these are the ones that go on to define the next stage of computing.

If you would like to read the rest of the article please take a look at Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud.

Reader Comments (7)

overwhelming o_O

February 1, 2010 | Unregistered Commenterjj

That's a great vision for the future, but for the sake of a Devil's Advocate argument, I'll offer a different view. This is intended in the spirit of discussion, not to contradict your excellent blog post.

I see cloud computing as a new episode of a story we've seen several times before. It goes by a different name every few years: grid computing, network computing, peer-to-peer computing, distributed computing. Each time this paradigm comes on the scene, it's a bit more sophisticated, but essentially the same in that it divides a computing task up, uses multiple processors to do the work, and then reunites the results. The hope is that this architecture is transparent.

The motivation for these architectures is that we have computing work that's more complex than a single host can perform, at least in the time frame we want.

Why did these architectures fade away? Because as you cite, hardware does increase in power faster than bandwidth. It's complex to run jobs in distributed networks of computers. If a single computer can do the job fast enough, that's better because it's simpler to operate, cheaper, has a lower barrier for entry, etc.

But the demands of modern computing continue to grow more sophisticated. We always outstrip the hardware capacity, until hardware experiences a quantum leap in bang-for-the-buck. Then a single machine becomes more economical than a grid of computers.

That's what I've observed over the last 15 years or so. Will it continue to oscillate like that, between single-machine solutions and network solutions being the most economical? Maybe, for the short term.

February 1, 2010 | Unregistered CommenterBill Karwin

Thanks for that great summary!
Actually, this is much along the lines of the thinking behind LinkedProcess, which is creating an XMPP-based distributed computing infrastructure for processing Linked Data applications and many other scenarios, as outlined in this blog post on Social Computing. Distributed computing, accessible to any internet-connected device in the web of things, could then be applied to massively fault tolerant, peer- and socially controlled and connected resources. A more technical version of the argument for XMPP as the infrastructure is at Matt Tucker's blog.

Cheers,

/peter neubauer

February 1, 2010 | Unregistered CommenterPeter Neubauer

Bill, contradicting is fine as I'm almost assured to be wrong, and I'm also sure nothing I said is really original, but at least I tried to make some sense of things :-)

Most all of computing is telling something to do something and getting a result back, so there is a deep way in which most solutions look alike at some level. Where I think the Ambient Cloud notion is a little different is the idea of applications rearchitecting themselves in response to a marketplace of resources and demands, driven by a combination of politics, cost, and global scaling needs. I wouldn't see scale-up as a solution for world-scalable applications, but it certainly will be a powerful option for fixed-scale applications for a long time to come.

February 1, 2010 | Registered CommenterHighScalability Team

Peter, I have looked at the LinkedProcess site and it seemed like a cool compute grid option.

February 1, 2010 | Registered CommenterHighScalability Team

So in other words the internet will kill the cloud.

My personal belief is that "the cloud" is really made up of vendor-controlled, incompatible clouds and represents the latest iteration of silver-bullet vendor lock-in.

Ian Bicking writes: "It took me a while to get a handle on what "cloud computing" really means. What I learned: don’t overthink it. It’s not magic. It’s just virtual private servers that can be requisitioned automatically via an API, and are billed on a short time cycle." (http://blog.ianbicking.org/2010/01/29/new-way-to-deploy-web-apps/)

Now, there is software out there being made to automatically provision VPSs. These are getting dirt cheap ($6.25 a month as of 3 minutes ago with google.com), so sooner rather than later millions of VPSs will be automatically commissionable through self-deploying software (python + paramiko + fabric) plus Ian Bicking's deployer (see link above). The only thing left is DNS with an API, which DNSmadeeasy is working on--or so they promised 3 days ago; see http://twitter.com/DNSMadeEasy/status/8397032464 -- and at $65 a month you can commission 10 VPSs for less than one Amazon instance, and that will come with some bandwidth already. True, the flexibility won't be there to scale up and down in minutes by hundreds of machines at a time, but in reality not that many people need that on a daily basis.

Because these will most likely be built on top of very stable OSes (Debian stable, Oracle OpenSolaris, or the like), there won't be vendor-specific differentiation, so anybody who can throw together a few hundred machines with Xen or OpenVZ can be a player in the field, as long as they can provide low-cost bandwidth.

And all this won't take long. I think we'll see a working solution before the end of the year that provides vendor-agnostic cloud deployment and allows for multi-cloud application deployments.

And once all that is working on servers and can be pushed with a fancy script, we'll start to see long-battery-life, wifi-enabled devices offering this sort of virtualization of a server-grade OS on the "commodity hardware" mobile-device, e-reader, and notebook clouds.

It's going to be interesting!

February 2, 2010 | Unregistered CommenterChristopher Mahan

API access to DNS services seems to be a hot item recently.

Unfortunately, the requests come in without due consideration of the potential for collateral damage from careless, insecure, or overly aggressive use of said API.

When someone asks for an API, they are really asking for another path into a data store that has to be maintained and secured in parallel with existing paths.

March 31, 2010 | Unregistered Commenterspenser
