Entries in cloud (63)

Tuesday
Mar 17, 2009

Sun to Announce Open Cloud APIs at CommunityOne

One of the key items Sun will be talking about in today's cloud computing announcement (at 9AM EST/6AM PST) will be Sun's opening of the APIs that we'll use for the Sun Cloud. We're making these available so that those who are interested will be able to review and comment on these APIs. Continuing our commitment to openness, we're making these APIs available via the Creative Commons Version 3.0 license. ...

Monday
Mar 16, 2009

Books: Web 2.0 Architectures and Cloud Application Architectures

I am excited about the upcoming release of two O'Reilly books on Web 2.0 and cloud application architectures.

Web 2.0 Architectures (estimated release in May 2009)
What entrepreneurs and information architects need to know

Using several high-profile Web 2.0 companies as examples, authors Duane Nickull, Dion Hinchcliffe, and James Governor have distilled the core patterns of Web 2.0 coupled with an abstract model and reference architecture. The result is a base of knowledge that developers, business people, futurists, and entrepreneurs can understand and use as a source of ideas and inspiration. Featured architectures include Google, Flickr, BitTorrent, MySpace, Facebook, and Wikipedia.

Cloud Application Architectures (estimated release in April 2009)
Building Applications and Infrastructure in the Cloud

This book by George Reese offers tested techniques for creating web applications on cloud computing infrastructures and for migrating existing systems to these environments. Specifically, you'll learn about the programming and system administration necessary for supporting transactional web applications in the cloud -- mission-critical activities that include orders and payments to support customers.

The second book is available online at O'Reilly as a Rough Cuts version, so you might already have had a chance to check it out. If so, do you like it?

Wednesday
Mar 11, 2009

Classifying XTP systems and how cloud changes which type startups will use

I try to group XTP systems into two main groups, type 1 and type 2, and then subdivide type 2 into 2a and 2b. I describe how I do this grouping and then amplify it a little in the context of cloud services.

Friday
Mar 06, 2009

Cloud Programming Directly Feeds Cost Allocation Back into Software Design

Update 6: CARS = Cost Aware Runtimes and Services by William Louth.
Update 5: Damn You Google, Damn You Yahoo! Why D'Ya Do This to Us? Free accounts on a cloud platform are a constant drain of money.
Update 4: Caching becomes even more important in CPU-based billing environments. Avoiding the CPU means saving money.
Update 3: An interesting simple example of this idea showed up on the Google AppEngine list. With one paging algorithm and one use of AJAX the yearly cost of the site was $1000. By changing those algorithms the site went under quota and became free again. This will make life a lot more interesting for developers.
Update 2: Business Model Influencing Software Architecture by Brandon Watson. The profitability of your project could disappear overnight on account of code behaving badly.
Update: Amazon adds Elastic Block Store at $0.10 per 1 million I/O requests. Now I need some cost minimization storage algorithms!

At the GAE Meetup yesterday a very interesting design rule came up: Design By Explicit Cost Model. A clumsy name, I know, but it is explained like this:

 

If you are going to be charged for an operation, GAE wants you to explicitly ask for it. This is why some automatic navigation between objects isn't provided: leaving it out forces you to write an explicit query. Writing an explicit query is a sort of EULA for being charged. Click OK in the form of a query and you've indicated that you are prepared to pay for a database operation.

Usually in programming the costs we talk about are time, space, latency, bandwidth, storage, person hours, etc. Listening to the Google folks talk about how one of their explicit design goals was to require programmers to be mindful of operations that will cost money made me realize that in cloud programming, dollar cost will be another aspect of design we'll have to factor in.

In addition to asking for the Big O complexity of an algorithm, we'll also have to ask for the Big $ (or Big Euro) notation so we can judge an algorithm by its cost against a particular cloud profile. Maybe something like $(CPU=1.3, DISK=3, IN-BANDWIDTH=2, OUT-BANDWIDTH=3, DB=10). You could look at the Big $ notation for an algorithm and shake your head, saying that approach will never work for GAE, but it could work for Amazon. Can we find a cheaper Big $?
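One way to picture a Big $ comparison is as a weighted sum: a cloud profile assigns a price to each resource, an algorithm has a usage figure for each resource, and the cost is the sum of price times usage. The sketch below is only an illustration of that idea. The profile names, prices, units, and the two candidate algorithms are all invented for the example and are not real GAE or Amazon rates.

```python
from dataclasses import dataclass

# Resource names follow the example tuple in the text. Units and prices are
# hypothetical placeholders, not real GAE or AWS rates.
RESOURCES = ("cpu", "disk", "in_bandwidth", "out_bandwidth", "db")


@dataclass
class CloudProfile:
    """Price per unit of each resource for one (made-up) cloud."""
    name: str
    prices: dict

    def cost(self, usage: dict) -> float:
        """Dollar cost of one run of an algorithm with the given usage."""
        return sum(self.prices[r] * usage.get(r, 0.0) for r in RESOURCES)


def cheapest(profile: CloudProfile, candidates: dict) -> str:
    """Return the name of the candidate algorithm with the lowest cost."""
    return min(candidates, key=lambda name: profile.cost(candidates[name]))


# Two invented profiles: one where CPU is the expensive resource,
# one where disk and database operations are.
cpu_scarce = CloudProfile("cpu-scarce", {
    "cpu": 10.0, "disk": 1.0, "in_bandwidth": 1.0, "out_bandwidth": 1.0, "db": 2.0,
})
cpu_cheap = CloudProfile("cpu-cheap", {
    "cpu": 1.0, "disk": 3.0, "in_bandwidth": 2.0, "out_bandwidth": 3.0, "db": 10.0,
})

# Two invented ways of doing the same job, with different resource usage.
candidates = {
    "recompute_each_time": {"cpu": 5.0, "db": 1.0},
    "precompute_and_store": {"cpu": 1.0, "disk": 2.0, "db": 3.0},
}

for profile in (cpu_scarce, cpu_cheap):
    for name, usage in candidates.items():
        print(f"{profile.name:>10} {name:<22} ${profile.cost(usage):.2f}")
    print(f"{profile.name:>10} cheapest: {cheapest(profile, candidates)}")
```

With these made-up numbers the same pair of algorithms ranks differently under the two profiles, which is the point of judging an algorithm against a particular cloud profile rather than in the abstract.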

Typically infrastructure costs are part of the capital budget. Someone ponies up for the hardware, and software is then "free" until more infrastructure is needed. The dollar cost of a software design isn't usually considered explicitly.

Now software design decisions are part of the operations budget. Every algorithm decision you make will have a dollar cost associated with it, and it may become more important to craft algorithms that minimize operations cost across a large number of resources (CPU, disk, bandwidth, etc.) than it is to trade off our old friends space and time.

Different cloud architectures will force very different design decisions. Under Amazon CPU is cheap, whereas under GAE CPU is a scarce commodity. Applications will not port easily between the two niches.

Don't be surprised if soon you go into an interview and they quiz you on Big $ notation and skip the dusty old relic that is Big O notation :-)

Thursday
Mar 05, 2009

Strategy: In Cloud Computing Systematically Drive Load to the CPU

Update 2: Linear Bloom Filters by Edward Kmett. A Bloom filter is a novel data structure for approximating membership in a set. A Bloom join conserves network bandwidth in exchange for cheaper, more plentiful local CPU utilization and disk IO.
Update: What are Amazon EC2 Compute Units? Cloud providers charge for CPU time in voodoo units like "compute units" and "core hours." Geva Perry takes on the quest of figuring out what these mean in real life.

I attended Sebastian Stadil's AWS Training Camp Saturday and during the class Sebastian brought up a wonderfully counter-intuitive idea: CPU (EC2) costs a lot less than storage (S3, SDB) so you should systematically move as much work as you can to the CPU. This is said to be the Client-Cloud Paradigm. It leverages the well-pummeled trend that CPU power follows Moore's Law while storage follows The Great Plains' Law (flat). And what sane computing professional would do battle with Sir Moore and his trusty battle sword of a law?

Embedded systems often make similar environmental optimizations. CPU rich and memory poor means operate on compressed serialized data structures. Deserialized data structures use a lot of memory, so why use them? It's easy enough to create an object wrapper around a buffer. Programmers shouldn't care how their objects are represented anyway. Yet we waste ginormous amounts of time and memory uselessly transforming XML in and out of different representations. Just transport compressed binary objects around and use them in place. Serialization and deserialization happen only on access (Pimpl Idiom).

It never occurred to me that in the land of AWS plenty similar "tricks" would make sense. But EC2 is a loss leader in AWS. CPU is plentiful and cheap. It's IO and storage that cost you...

The implication is that in your system design you should try and use EC2 as much as possible:

  • Compress data. Saves on bandwidth and storage (the expensive bits) and uses cheaper CPU to compress/decompress.
  • Slurp data. Latency cost is higher than performing operations locally. SDB can take up to 400 msecs between data centers and 200 msecs inside the same data center. This is very slow. It's usually faster, but it can take that long. Following the more traditional serial processing path of "get a record do a record" will take forever and cost more. Slurp up all your records from SDB and farm them out to your CPU nodes to be worked on in parallel.
  • Think parallel. Do multiple operations at once on your cheap CPUs rather than serially performing high latency operations on expensive storage. With enough nodes, total execution time approaches max latency.
  • Client side joins. Pull all data from the relatively expensive SDB and perform client side joins on relatively cheap EC2 nodes.
  • Leverage SQS. It's a relatively cheap part of the ecosystem. Keeping a work queue in SDB would be far more expensive.

When all the implications are fully explored, it's a slightly different take on designing a system.

I found some interesting numbers in a Slashdot thread comparing values:

No persistent storage; not great value: And it's still not a great value. It seems cheap. $72/mo for a 1.7GB RAM server. Well, look at Slicehost and you can get a 2GB RAM Xen instance (same virtualization software as EC2) for $140 WITH persistent storage and 800GB of bandwidth. That doesn't sound like a great deal UNTIL you calculate what EC2 bandwidth costs. 800GB would cost you $144 at $0.18 per GB bringing the total cost to $216 ($76 more than Slicehost). That 18 cents doesn't sound like much, but it adds up. The same situation happens with Joyent. For $250 you get a 2GB RAM server from them (running under Solaris' Zones) with 10TB of bandwidth. That would cost you $1,872 with EC2. Even if you assume that you'll only use 10% of what Joyent is giving you, EC2 still comes in at a cost of $252 - and without persistent storage!
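The arithmetic in the quoted Slashdot comment can be replayed in a few lines. This is just a check of the quoted 2009-era figures ($72 per month for the instance, $0.18 per GB of bandwidth), not current pricing. It treats 10TB as 10,000GB, which is an assumption, though it is the reading that reproduces the quoted $1,872.

```python
# Figures taken from the quoted comment (2009-era prices, not current ones).
EC2_INSTANCE_PER_MONTH = 72.00  # 1.7GB RAM instance
EC2_BANDWIDTH_PER_GB = 0.18


def ec2_monthly_cost(bandwidth_gb: float) -> float:
    """Instance price plus metered bandwidth for one month."""
    return EC2_INSTANCE_PER_MONTH + EC2_BANDWIDTH_PER_GB * bandwidth_gb


# label -> (bandwidth used in GB, price of the bundled alternative)
# Assumption: 10TB is counted as 10,000GB, matching the quoted totals.
scenarios = {
    "Slicehost bundle (800GB for $140)": (800, 140.00),
    "Joyent bundle (10TB for $250)": (10_000, 250.00),
    "Joyent at 10% bandwidth use (1TB)": (1_000, 250.00),
}

for label, (gb, bundled_price) in scenarios.items():
    ec2 = ec2_monthly_cost(gb)
    print(f"{label}: EC2 ${ec2:,.2f} vs ${bundled_price:,.2f} "
          f"(difference ${ec2 - bundled_price:,.2f})")
```

The instance price is fixed while the bandwidth term scales with traffic, so at these rates bandwidth dominates the bill well before 1TB per month.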

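Three of the strategies listed above (compress data, slurp records in parallel, and join on the client) can be sketched together. This is a toy simulation, not AWS code: the "store" is an in-memory dict standing in for a slow service such as SDB, the 50 ms latency is an arbitrary assumption, and a local thread pool stands in for a set of EC2 nodes. All names here are hypothetical.

```python
import json
import time
import zlib
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY_S = 0.05  # assumed per-request latency, for illustration only

# Hypothetical in-memory stand-in for a slow remote store. Values are kept as
# compressed JSON, so less data would be stored and transferred.
USERS = {
    i: zlib.compress(json.dumps({"id": i, "name": f"user{i}"}).encode())
    for i in range(20)
}
ORDERS = [{"user_id": i % 20, "total": 10 * i} for i in range(50)]


def fetch_user(user_id):
    """Simulate one high-latency read, then spend CPU to decompress it."""
    time.sleep(SIMULATED_LATENCY_S)
    return json.loads(zlib.decompress(USERS[user_id]))


def slurp_users(user_ids, workers=20):
    """Issue all reads concurrently instead of one record at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_user, user_ids))


def client_side_join(users, orders):
    """Hash join done locally: index users by id, then probe once per order."""
    by_id = {u["id"]: u for u in users}
    return [
        {"name": by_id[o["user_id"]]["name"], "total": o["total"]}
        for o in orders
        if o["user_id"] in by_id
    ]


start = time.perf_counter()
users = slurp_users(range(20))
elapsed = time.perf_counter() - start
joined = client_side_join(users, ORDERS)

serial_estimate = len(USERS) * SIMULATED_LATENCY_S
print(f"fetched {len(users)} users in {elapsed:.2f}s "
      f"(one at a time would take about {serial_estimate:.2f}s)")
print(f"joined {len(joined)} order rows locally")
```

With as many workers as requests, the fetch phase takes roughly one request's latency rather than the sum of all of them, which is the "total execution time approaches max latency" point from the list.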
    Tuesday
    Jan 27, 2009

    Video: Storage in the Cloud at Joyent

    Ben Rockwood of Joyent speaks on "Storage in the Cloud" at the first OpenSolaris Storage Summit. Ben is the Director of Systems at Joyent. The Joyent Accelerators are based on OpenSolaris and ZFS. He has deep experience with OpenSolaris in the Real World.

    Monday
    Jan 12, 2009

    Getting ready for the cloud

    This presentation illustrates how one can scale an EXISTING JEE application and deploy it on the Amazon cloud using GigaSpaces as the scale-out application server while:

  • Not having to re-write your application
  • Preventing lock-in to a specific cloud provider
  • Enabling seamless portability between your local environment and the cloud environment
      o No code or configuration change is required between the two environments
      o Develop local - test on the cloud
      o Built for iterative development

    Friday
    Nov 14, 2008

    Useful Cloud Computing Blogs

    Update 2: Overcast: Conversations on Cloud Computing. Listened to the first two podcasts and they're doing a great job. Worth a look. The singing and dance routines are way over the top however :-)
    Update: 9 Sources of Cloud Computing News You May Not Know About by James Urquhart. I folded in these recommendations.

    Can't get enough cloud computing? Then you must really be a glutton for punishment! But just in case, here are some cloud computing resources, collected from various sources, that will help you transform into a Tesla silently flying solo down the diamond lane.

    Meta Sources

  • Cloud Computing Email List: An often lively email list discussing cloud computing.
  • Cloud Computing Blogs & Resources. An excellent and big list of cloud resources.
  • Cloud Computing Portal: A community edited database for making the vendor selection process easier.
  • List of Cloud Platforms, Providers, and Enablers.
  • datacenterknowledge.com's Recap: More than 70 Industry Blogs: A nice set of blogs for: Data Center, Web Hosting, Content Delivery Network (CDN), Cloud Computing
  • Cloud Computing Wiki: A cloud computing wiki started by participants of the cloud email list.

    Specific Blogs

  • Cloud Computing on Twitter: Geva Perry's Big List of People Who Twitter About Cloud Computing
  • Overcast: Conversations on Cloud Computing: Podcast series on cloud computing by James Urquhart and Geva Perry.
  • James Urquhart's The Wisdom of Clouds: Cloud Computing and Utility Computing for the Enterprise and the Individual. James writes great articles and has a regular can't-miss links-style post summarizing much of what you need to know in cloud world.
  • http://Blog.RightScale.com: Cloud Computing. Delivered.
  • Randy Bias's Cloudscaling: State of the Art for Startups.
  • http://elasticserver.blogspot.com/: Elastic Server - CohesiveFT team blog.
  • Nicholas Carr's Roughtype: Author of The Big Switch: Rewiring the World From Edison to Google.
  • Christofer Hoff's Rational Survivability: Ramblings about Information Survivability, Information Centricity, Risk Management and Disruptive Innovation. Oh, I have a fondness for virtualization, too.
  • Tim Freeman's Virtualization and Grid Computing: Primary developer of the Virtual Workspaces project.
  • Kent Langley's ProductionScale: Scalable Web Infrastructure and Technology Operations.
  • Kevin Jackson's Cloud Musings: Personal comments and insight on cloud computing and its relationship to net-centric warfare.
  • GoGrid Blog: Blog with product and industry news related to Cloud Computing and GoGrid.
  • John Willis' IT Management and Cloud Blog: Personal comments and podcasts.
  • Bert Armijo's Head In The Clouds: SVP at 3tera, includes product info as well as comments on industry events
  • Ross Cooney's SpoutingShite: MD of Rozmic. Cloud computing, email and spam.
  • TodoOnDemand: Blog about SaaS, Cloud Computing, On Demand Software, Business models, etc...
  • Jason Meiers' CAM Blog: Monitoring composite applications for cloud computing.
  • Sam Johnston: Random rants about tech stuff.
  • Jian Zhen's and Michael Mucha's On SaaS
  • Dana Gardner's BriefingsDirect
  • Cloud - Web and Service Cloud
  • Virtualization and Grid Computing: On distributed computing, VMs, Globus, Xen, Nimbus, and other technology.
  • Reuven Cohen's ElasticVapor Blog. The ramblings of Reuven Cohen, co-Founder & CTO Enomaly Inc.
  • ENKI Blog: Managed Cloud Computing Blog.
  • Cirrhus9's and M-E Consulting's Working in the Cloud: Cloud computing solutions for the world - or at least for Southern California.
  • Craig Balding's Cloud Security Blog: This blog is dedicated to Cloud Computing and Security.
  • Dell's Cloud Computing Blog
  • Chirag Mehta's Cloud Computing Blog: Architecture, strategy, design, and innovation ramblings.
  • GigaOm's Infrastructure Blog
  • Markus Klems' Cloudy Times Blog
  • Geva Perry's Thinking Out Cloud: Cloud Computing, Grids, Everything-as-a-Service and more.
  • James Hamilton's Perspective Blog
  • SearchDataCenter.com’s Server Farming Blog: Discusses the latest in server hardware, systems management, Unix-Linux-Wintel operating systems and large distributed computing systems
  • William Vambenepe's blog: IT management in a changing IT world
  • Toon Vanagt's virtualization.com/
  • Data Center Knowledge: News and analysis about data centers, managed hosting and disaster recovery.
  • Nati Shalom's Blog: Discussions about middleware and distributed technologies.
  • Appistry Blogs: At the convergence of Grid Computing, Virtualization and SOA
  • Avastu's Blog: Sustainable Global Clouds - REAL-TIME MARKET ANALYSIS & RESEARCH ON CLOUD COMPUTING, VIRTUALIZATION, GLOBAL SOURCING, EMERGING TRENDS AND BUSINESS STRATEGIES
  • Dan Kusnetzky's & Paula Rooney's Virtually Speaking
  • Phil Wainewright's Software as Services
  • Grid Gurus: helping realize the value from cluster, distributed and grid computing.
  • Joyent's Blog: Cloud computing vendor.
  • Grid Designer's Blog: Consulting firm specializing exclusively in "extreme" applications and systems.
  • Rob Thorsten's Why Amazon’s RightScale Blog: Primarily talks about Amazon, but there's a lot of good general cloud info too.
  • On-Demand Enterprise: tracks the greater on-demand world beyond.
  • Google Alerts: "Cloud Computing" | "Utility Computing"
  • Jian Zhen's and Michael Mucha's cloudfeed.net: An automated feed of cloud computing and SaaS related stories.
  • On-Demand Enterprise's Cloud Computing Topic: Excellent coverage of vendors, traditional enterprise data center software, and the virtualization space.
  • Avastu Blog: Sustainable Global Clouds: REAL-TIME MARKET ANALYSIS & RESEARCH ON CLOUD COMPUTING, FINANCIAL MARKETS, VIRTUALIZATION, GLOBAL SOURCING, EMERGING TRENDS AND BUSINESS STRATEGIES.
  • TechCrunchIT: dedicated to obsessively profiling products and companies in the Enterprise Technology space.

    Know any other good blogs that should be on this list?

    Thursday
    Nov 13, 2008

    CloudCamp London 2: private clouds and standardisation

    CloudCamp returned to London yesterday, organised with the help of Skills Matter at the Crypt on Clerkenwell Green. The main topics of this cloud/grid computing community meeting were service-level agreements, connecting private and public clouds, and standardisation issues.

    Thursday
    Nov 13, 2008

    Plenty of Fish Says Scaling for Free Doesn't Pay

    Plenty of Fish CEO Markus Frind, famous nerd hero for making over $10 million a year from Google ads on a free dating site he made and ran all by himself, now sees a problem with the free model:

    The problem with free is that every time you double the size of your database the cost of maintaining the site grows 6 fold. I really underestimated how much resources it would take, I have one database table now that exceeds 3 billion records. The bigger you get as a free site the less money you make per visit and the more it costs to service a visit...There is really no money in being free and we have to start experimenting with other models now or we won’t be able to compete in 3 or 4 years.

    As one commenter succinctly put it: the “golden time” of AdSense is over. Time to look at costs. The POF architecture is to run scarily huge tables on single machines. They also buy and maintain their own SAN. So it seems scaling up is what is increasing costs and decreasing profits. I wonder if the economics of cloud storage and cloud architectures might have a more linear cost curve?
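To get a feel for what "6 fold per doubling" would mean next to a linear cost curve, here is a small back-of-the-envelope sketch. It assumes the 6x factor holds at every doubling, which the quote does not claim, so treat the output as an illustration of the shape of the curve rather than a model of POF's actual costs.

```python
import math

COST_FACTOR_PER_DOUBLING = 6  # from the quote: cost grows 6 fold per doubling

# If that factor held at every scale, cost would grow as size ** log2(6).
exponent = math.log2(COST_FACTOR_PER_DOUBLING)


def relative_cost(size_multiple: float, power: float) -> float:
    """Cost relative to the starting point after data grows size_multiple times."""
    return size_multiple ** power


print(f"implied exponent: {exponent:.3f}")
for doublings in range(1, 6):
    size = 2 ** doublings
    print(f"{size:>3}x data: {relative_cost(size, exponent):>8,.0f}x cost at 6x per doubling, "
          f"{relative_cost(size, 1.0):>3,.0f}x cost if linear")
```

Under that assumption five doublings (32x the data) would mean thousands of times the cost, against 32x for a linear curve, which is why the question of a more linear cost curve matters.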
