Entries in cloud (63)

Wednesday
Oct 01 2008

Joyent - Cloud Computing Built on Accelerators

Kent Langley was kind enough to create a profile template for Joyent, Kent's new employer. Joyent is an infrastructure and development company that has put together a multi-site, multi-million dollar hosting setup for their own applications and for the use of others. Joyent competes with the likes of Amazon and GoGrid in the multi-player cloud computing game and hosts Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App. The template was originally created with web services in mind, not cloud providers, but I think it still works in an odd sort of way. Remember, anyone can fill out a profile template for their system and share their wonderfulness with the world.

Getting to Know You

  • What is the name of your system and where can we find out more about it? Joyent Accelerator, cloud computing IaaS. My name is Kent Langley, Sr. Director, Joyent, Inc. (www.productionscale.com). The Joyent website is located at www.joyent.com. The scope of this exercise is the Joyent Accelerator product: http://www.joyent.com/accelerator/
  • What is your system for? It is essentially a system that provides infrastructure primitives as a service (IaaS) for building cloud computing applications, migrating enterprise data center operations to secure private clouds, or just hosting your blog. There is a page on the site about what scales on Joyent: http://www.joyent.com/accelerator/what-scales-on-joyent/ Java, PHP, Ruby, Erlang, Perl, and Python all work beautifully on Joyent. There is no lock-in. Ever. We try to run an open cloud. It's also a "loving cloud" if you ask our CTO. We have some of the largest Rails applications in the world, very high volume ejabberd XMPP infrastructure, exceptionally large Drupal installations, commerce sites in private clouds, .NET with Mono, Tomcat, Resin, Glassfish, and much more all running on Accelerators. Joyent Accelerators are also perfect building blocks for almost any PaaS (Platform as a Service) play. Of particular note, Java runs exceptionally well on Accelerators: because Accelerators are 64-bit you can run a 64-bit JVM that addresses as much as 32 GiB of RAM, giving excellent vertical scalability for any running JVM.
  • Why did you decide to build this system? There is demand for a high-end but reasonably priced elastic computing infrastructure.
  • How is your project financed? Self-Funded at this time.
  • What is your revenue model? We sell Joyent Accelerators, do Scale Consulting, and some related Services. We also have a growing Partner Channel.
  • How do you market your product? Website, word of mouth, blogs, email lists, Twitter, event sponsorships, open source participation, forums, FriendFeed, and more...
  • How long have you been working on it? I, Kent Langley, have been working with Joyent for about 2.5 years as a client. I've been with the company as an employee for about 2 months. Joyent has existed formally for about 4 years.
  • How big is your system? Try to give a feel for how much work your system does. We have hundreds and hundreds of servers representing significant compute power across thousands of cores in multiple locations.
  • Number of monthly page views? Billions and billions (multiple clients each doing a billion+ page views per month).
  • What is your in/out bandwidth usage? That's a secret.
  • How many documents do you serve? How many images? How much data? Billions per month.
  • How fast are you growing? Fast enough to give me grey hairs.
  • What is your ratio of free to paying users? Very low. Most of our users have paid accounts. We do have some free offerings to help people get started. But, the demand for those services has been high so the lines are a little long.
  • What is your user churn? About Average for the industry we think.
  • How many accounts have been active in the past month? Thousands.

    How is your system architected?

  • What is the architecture of your system? Talk about how your system works in as much detail as you feel comfortable with. Our technology stack is predicated on something we call a Pod. We have several Pods and plans to add more. From top to bottom you'd find:
    • BigIP F5 application switches
    • Force10 switching (1 Gb and 10 Gb)
    • Custom Dell hardware with some secret sauce
    • Tier 1 hosting providers
    • Essentially a custom Solaris Nevada based OS core with a pkgsrc install system
  • What particular design/architecture/implementation challenges does your system have? Automation. Automation. Automation. Self-Service.
  • What did you do to meet these challenges? We have an amazing team of Systems Developers that work very hard to improve our ability to grow and manage systems each day. We have some great updates on our Roadmap coming up that should be very exciting for existing and potential customers.
  • How does your system evolve to meet new scaling challenges? Our system is by its nature evolutionary. As technology grows and changes, we grow with it. A recent example is when a client needed a private cloud computing environment to achieve PCI compliance in a cloud environment, so we worked with the client to create one. While it is in production for two clients already, we consider this a beta product, but you should expect to see it as a formal offering soon. This is one way we have evolved our systems to respond to the changing cloud computing marketplace.
  • Do you use any particularly cool technologies or algorithms? ZFS, BigIP, DTrace, OpenSolaris Nevada, and an in-house custom provisioning system we call MCP (hat tip to Tron).
  • What do you do that is unique and different that people could best learn from? We know our approach to things is a little different, but we think that helps us inhabit a space different enough from other vendors in the cloud computing space that we offer a significant value proposition to a large cross-section of the IT industry. From the lone developer with a great idea who comes in and picks up a $199 per year 1/4 GiB Accelerator, to a deployment running literally hundreds of Accelerators for the largest Rails applications on the planet, we are able to take good care of them both.
  • What lessons have you learned? If at first you don't succeed, try, try again. Get over your mistakes and move on.
  • Why have you succeeded? We care. Our clients care. That's a nice fit.
  • How are you thinking of changing your architecture in the future? MORE secret sauce... But seriously, we have some great additions coming up soon. I'll be in touch.

    How is your team setup?

  • How many people are in your team? Joyent has a small employee to client ratio. But, that's because we do what we do well. We are divided into several of the normal divisions you might expect like client support, marketing, sales, development, operations, and the business units.
  • Where are they located? Our corporate office is in Sausalito, CA. We have a development team in Seattle. We have a support organization that follows the sun and spans the globe. IM is a big deal at Joyent.
  • Is there anything that you would do different or that you have found surprising? I'd say managing expectations is the most challenging thing. I think that's where we stand to improve the most and where most of the surprises come from.

    What infrastructure do you use?

  • Which languages do you use to develop your system? Ruby
  • How many servers do you have? Not as many as Google!
  • How is functionality allocated to the servers?
  • How are the servers provisioned? We have a custom cloud provisioning system called MCP.
  • What operating systems do you use? Customized Sun Solaris Nevada
  • Which web server do you use? Apache and Nginx are the work horses in Joyent Accelerators
  • Which database do you use? MySQL and Postgres are included with every Accelerator. Oracle works if you bring your own licenses. CouchDB works. We are certifying more all the time.
  • Do you use a reverse proxy? Well, our clients often use Nginx, and now that there is a viable port of Varnish to OpenSolaris we are seeing more of that. Some of our clients use Squid as well. Most popular reverse proxy software will run fine on our setup.
  • Do you colocate, use a grid service, use a hosting service, etc? We are that.
  • What is your storage strategy? DAS/SAN/NAS/SCSI/SATA/etc/other? We provide NFS to our clients for $0.15/GiB. 1 GiB = 1024 MiB.
  • How much capacity do you have? Many Terabytes
  • How do you grow capacity? Add hardware
  • Do you use a storage service? We are a storage service.
  • Do you use storage virtualization? Not really. It has been and continues to be tested, but in many cases you still can't beat the real thing.
  • How do you handle session management? Our clients do this depending on their development platform of choice at the application layer. Also, we can of course use our BigIP load balancing infrastructure to help out with that also.
  • How is your database architected? Master/slave? Shard? Other? All of the above, client by client. Master-master MySQL, master-slave MySQL, Oracle clusters, MySQL Cluster, Postgres, etc. all work fine.
  • How do you handle load balancing? We have F5 BigIPs and we do what we call a managed load balancing service. For example, if you have two application servers you need to load balance, just ask us to set up a VIP and we'll add the nodes you specify for a cost per node. All the pricing information is here: http://www.joyent.com/accelerator/pricing/
  • Which web framework/AJAX Library do you use? We have clients that use just about everything you can think of.
  • Which real-time messaging frameworks do you use? We have very large clients running ejabberd. Erlang works great on our systems.
  • Which distributed job management system do you use? This is client by client. We do not offer this out of the box.
  • How do you handle ad serving? This is up to the client. We've seen just about all of them.
  • What is your object and content caching strategy? We usually recommend memcached; it's pre-installed and ready to turn on (see the sketch after this list).
  • What is your client side caching strategy? I'd say most of our clients use cookies.
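
    To make the object caching recommendation above concrete, here is a minimal sketch of the usual server-side pattern: check memcached before doing expensive work, and store the result with a TTL. The client library (the open-source spymemcached Java client) and the localhost/default-port address are our choices for illustration, not anything Joyent prescribes.

    import java.net.InetSocketAddress;
    import net.spy.memcached.MemcachedClient;

    public class PageCache {
        public static void main(String[] args) throws Exception {
            // memcached assumed to be on localhost at its default port, 11211.
            MemcachedClient cache =
                new MemcachedClient(new InetSocketAddress("127.0.0.1", 11211));

            String key = "page:/home";
            String page = (String) cache.get(key);   // check the cache first
            if (page == null) {
                page = "<html>...expensively rendered page...</html>";
                cache.set(key, 300, page);           // cache it for 300 seconds
            }
            System.out.println(page);

            cache.shutdown();
        }
    }

    The same get/set protocol is available from Ruby, PHP, Python, and nearly every other language that runs on an Accelerator.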

    How do you handle customer support?

    We have a customer support team that is dedicated to helping our customers. Our services pretty much assume that you will have some degree of ability with building and deploying systems. However, if you don't, we have standard and extended plans, and partners, all of which can be combined in various ways to help our clients. Our support follows the sun around the world.

    How is your data center setup?

  • How many data centers do you run in? Several. :) Currently all domestic, on both coasts and elsewhere.
  • How is your system deployed in data centers? In-House Automated provisioning systems
  • Are your data centers active/active, active/passive? Everything is always on. Our clients often co-locate in multiple locations so that they can have solid DR scenarios to keep investors happy and recover quickly should a truck hit a telephone pole or something.
  • How do you handle syncing between data centers and fail over and load balancing? This is a complex topic and can be very simple or very complex. It's a bit out of scope for this document.
  • Which DNS service do you use? We run our own based on PowerDNS
  • Which switches do you use? Force10
  • Which email system do you use? Mostly Postfix
  • How do you handle spam? Filter at a variety of levels
  • How do you backup and restore your system? High level snapshots; clients are primarily responsible for their own data. However, we have ways to help them.
  • How are software and hardware upgrades rolled out? We do quarterly releases of key software and our Accelerators. Sometimes we get a little behind but try to roll with it. You get root on your Accelerator so you are not dependent on the Joyent release cycle at all.
  • How do you handle major changes in database schemas on upgrades? This is up to the clients and is highly platform and application specific.
  • What is your fault tolerance and business continuity plan? Lots of redundancy.
  • Do you have a separate operations team managing your website? No. We do it ourselves.
  • Do you use a content delivery network? If so, which one and what for? Yes. We are currently partnered with Limelight.
  • How much do you pay monthly for your setup? Accelerator plans range from $199 per year to $4000 per month. Significant discounts can be had if you pay ahead. But it's very important to note that we do not require or even want contracts. Some companies try to force us into contracts, and if you just MUST lock yourself in for years, we'll tie you down, but we don't recommend it at all. In essence, pay for what you need when you need it at month-to-month granularity. http://www.joyent.com/accelerator/pricing/

    SUMMARY

    The Joyent Accelerator is an extremely flexible tool for building and deploying all manner of infrastructure. If you have questions, please just contact us at sales@joyent.com. Email is usually the best way to reach us.

    Related Articles

  • Scaling in the Cloud with Joyent's Jason Hoffman (podcast)
  • Amazon Web Services or Joyent Accelerators: Reprise by Jason Hoffman


    Thursday
    Sep 25 2008

    GridGain: One Compute Grid, Many Data Grids

    GridGain was kind enough to present at the September 17th instance of the Silicon Valley Cloud Computing Group. I've been curious about GridGain so I was glad to see them there. In short, GridGain is: an open source computational grid framework that enables Java developers to improve general performance of processing intensive applications by splitting and parallelizing the workload. GridGain can also be thought of as a set of middleware primitives for building applications. GridGain's peer group of competitors includes GigaSpaces, Terracotta, Coherence, and Hadoop. The speaker for GridGain was the President and Founder, Nikita Ivanov. He has a very pleasant down-to-earth way about him that contrasts nicely with a field given to religious discussions of complex taxonomic definitions. Nikita first talked about cloud computing in general. He feels Java is the perfect gateway for cloud computing. Which is good, because GridGain only works with Java. The Java centricity of GridGain may be an immediate deal killer or a non-issue for a Java shop. Being so close to the language does offer a lot of power, but it sure sucks in a multi-language environment. Nikita gave a few definitions which are key to understanding where GridGain stands in the grid matrix:

  • Compute Grids: parallel execution.
  • Data Grids: parallel data storage.
  • Grid Computing: Compute Grids + Data Grids
  • Cloud Computing: datacenter + API. The key is automation via programmability as a way to deploy applications. The advantage is a unified programming model: build an application on one node and you can run it on many nodes without code change. Moving peak loads to the cloud can give you a 10x-100x cost reduction. Cloud computing poses a number of challenges: deployment, data sharing, load balancing, failover, discovery (nodes, availability), provisioning (add, remove), management, monitoring, development process, debugging, and inter/external clouds (syncing data, syncing code, failing over jobs). Nikita talked some about these issues, but he didn't go in-depth. He did show a good understanding of the issues involved, so I would be inclined to think GridGain handles them well. The cloud computing section is new to the standard GridGain presentation. GridGain is moving their grid into the cloud with new features like a cloud management layer available in Q1 2009. This move competes with GigaSpaces' early move to the cloud with their RightScale partnership. It's a good move. Like peanut butter and chocolate, grids and clouds go better together. Grids have been underutilized largely because of infrastructure issues. A cloud platform makes it easy to affordably grow and manage grids, so we might see an uptick in grid adoption as clouds and grids hook up. GridGain positions themselves as a developer centric framework according to their analysis of cloud computing in Java:
  • Heavy UI oriented. These types of applications or frameworks usually provide UI-based consoles, management applications, plugins, etc. that provide the only way to manage resources on the cloud, such as starting and stopping the image. The key characteristic of this approach is that it requires substantial user input and human interaction, and thus these tools tend to be less dynamic and less on-demand. Good examples would be RightScale, GigaSpaces, and ElasticGrid.
  • Heavy framework oriented. This approach strongly emphasizes dynamism of resource management on the cloud. The key characteristic is that it requires no human interaction and all resource management can be done programmatically by the grid/cloud middleware, and thus it is more dynamic, automated, and truly on-demand. Google App Engine (for Python) and GridGain are good examples. I think there's a misunderstanding of RightScale here. The UI is to configure the automated system, not manage the system. The automated system monitors and responds to events without human interaction. Won't their automated cloud layer have to do something similar? To bootstrap any complex system out of the mud of complexity a helpful UI is needed. The framework approach of GridGain's infrastructure is developer friendly, but that won't fly for external management within the cloud.

    GridGain's True Nature: One Compute Grid, Many Data Grids

    With these definitions in place we can now learn the secret of GridGain: one compute grid, many data grids. Ding! Ding! Ding! Once I understood this I understood GridGain's niche. GridGain has focused on making it dead simple to distribute work across a compute grid. It's a job management mechanism. GridGain doesn't include a data grid. It will work against any data grid. For some reason this fact was something I'd never pulled out of the noise before. And when I would read Nikita's blog with all the nifty little code samples I never really appreciated what was happening. Yes, I'm just that dumb, but I also think GridGain should expose the magic of what's going on behind the scenes more, rather than push the simple 30-second-let's-write-code-live style demo. Seeing the mechanics would make it easier to build a mental model of the value being added by GridGain.

    Transparent and Low Configuration Implementation of Key Features

    A compute grid is just a bunch of CPUs that calculations/jobs/work can be run on. As a developer, your problems are broken up into smaller tasks and spread across all your nodes so the result is calculated faster because it is happening in parallel. GridGain enthusiastically supports the MapReduce model of computation. When deploying a grid a few key problems come up:
  • How do you get your code to all nodes? Not just the first time, but every time a JAR file changes, how is it distributed across all nodes?
  • How do all the other nodes find each other when they come up? Clearly for work to be sent to nodes someone must know about them.
  • How are jobs distributed to the nodes? Somehow jobs must be sent to a node, the calculations made, and the results assembled.
  • How are failures handled? Somehow when a node goes down and new nodes come on-line work must be rescheduled.
  • How does each node get the data it needs to do its work? Scalable computation without scalable data doesn't work for most problems.

    Much of the drama is lost with GridGain because most of these capabilities are implemented almost transparently. Discovery happens automatically. When nodes come up they communicate with each other and transparently form a grid. You don't see this, it just happens. In fact, this was one of GridGain's issues when porting to the cloud. They used multicast for discovery and Amazon doesn't support multicast. So they had to use another messaging service, which GridGain supports out-of-the-box, and they are now working on their own unicast version of the discovery service. Deploying new code is always a frustrating problem. Over the same transparently formed grid, code updates are transparently auto-deployed on the grid. Again, this is one of those things you see happen from Eclipse and it loses most of the impact. It just looks like how it's supposed to work, but rarely does. With GridGain you do a build and your code changes are automatically sent through to each node in the grid. Very nice. To mark a method as gridified, an annotation (or an API call) is used:
    @Gridify(taskClass = GridifyHelloWorldTask.class, timeout = 3000)
    public static int sayIt(String phrase) {
        // Simply print out the argument.
        System.out.println(">>> Printing '" + phrase + "' on this node from grid-enabled method.");
        return phrase.length();
    }
    
    The task class is responsible for splitting method execution into sub-jobs. For a full example go here. The @Gridify annotation uses AOP (aspect-oriented programming) to automatically "gridify" the method. I assume this registers the method with the job scheduling system. When the application comes up and triggers execution, the method is then scheduled through the job scheduling system and allocated to nodes. Again, you don't see this and they really don't talk enough about how this part works. Notice how much complexity is nicely hidden by GridGain with very little configuration on the developer's part. There aren't a billion different XML files where every single part of the system has to be defined ahead of time. The dynamic, transparent nature of the core features makes it simple to use.
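
    Since they don't talk much about this part, here is a sketch of what a task class like GridifyHelloWorldTask might look like, reconstructed from GridGain's published examples (class and package names follow the GridGain 2.x API and may differ in other versions): split() breaks the phrase into one job per word, each job runs on some node, and reduce() folds the per-job results back into a single return value.

    import java.io.Serializable;
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.List;

    import org.gridgain.grid.GridException;
    import org.gridgain.grid.GridJob;
    import org.gridgain.grid.GridJobAdapter;
    import org.gridgain.grid.GridJobResult;
    import org.gridgain.grid.gridify.GridifyArgument;
    import org.gridgain.grid.gridify.GridifyTaskSplitAdapter;

    public class GridifyHelloWorldTask extends GridifyTaskSplitAdapter<Integer> {
        // Split the gridified call into one job per word of the phrase.
        @Override
        protected Collection<? extends GridJob> split(int gridSize, GridifyArgument arg)
            throws GridException {
            String phrase = (String) arg.getMethodParameters()[0];

            List<GridJob> jobs = new ArrayList<GridJob>();
            for (final String word : phrase.split(" ")) {
                jobs.add(new GridJobAdapter<String>(word) {
                    // Each job executes on some node in the grid.
                    public Serializable execute() {
                        System.out.println(">>> Printing '" + word + "' on this node.");
                        return word.length();
                    }
                });
            }
            return jobs;
        }

        // Fold the per-node results back into the method's return value.
        @Override
        public Integer reduce(List<GridJobResult> results) throws GridException {
            int total = 0;
            for (GridJobResult res : results) {
                total += res.<Integer>getData();
            }
            return total;
        }
    }

    Note how the split/reduce pair is the whole programming model: everything else (discovery, code deployment, job routing, failover) is handled by the grid runtime described above.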

    Integrating with the Data Grid

    We haven't talked about data at all. If you are just concerned with a program like a Monte Carlo simulation then the compute grid is all you need. But most calculations require data. Where does your massive compute grid pull the data from? That's where the data grid comes in. A data grid is the controlled sharing and management of large amounts of distributed data. GridGain leaves the data grid up to other software by integrating with packages like JBoss Cache, Oracle Coherence, and GigaSpaces. Remember: one compute grid, many data grids. GridGain accesses the data grid through an API, so you can plug in any data grid you want to support with a little custom code. Google and Hadoop use a distributed file system (DFS) as their data grid. This makes sense. When you need to feed lots of CPUs the data can't come from a centralized store. The data must be parallelized and that's what a DFS does. A DFS splats data across a lot of spindles so it can be pulled relatively quickly by lots of CPUs in parallel. Other products like Coherence and GigaSpaces store data in an in-memory data grid instead of a filesystem. Serving data from memory is faster, but you are limited by the amount of memory you have. If you have a petabyte of data your choice is clear, but if your problem is a bit smaller then maybe an in-memory solution would work. The closer data is to the business logic the better performance will be. GridGain controls job execution while the data grid is responsible for the availability and integrity of the data. GridGain doesn't care what data grid you use, but your choice has implications for performance. A compute grid and an in-memory data grid in the same cloud will smoke configurations where the data grid comes from disk or is located outside the cloud.

    GridGain is Linearly Scalable for a Pure CPU Benchmark

    The good folks at GridDynamics are doing some serious cloud testing of different products and different clouds. They ran a test, Scalability Benchmark of Monte Carlo Simulation on Amazon EC2 with GridGain Software, that found GridGain was linearly scalable to 512 nodes in Amazon's EC2. A Monte Carlo simulation is a CPU-only test; it does not use a data grid. A data grid based test would be more useful to me, as everything changes once large amounts of data start flying around, but it does indicate the core of GridGain is quite scalable.

    Wrapping Up

    Grid products like Coherence and GigaSpaces include both compute grid and data grid features. Why choose a compute grid only system like GridGain when other products include both capabilities? GridGain might say they win business on the quality of their compute grid, excellent support and documentation, and the ability to cleanly integrate into almost any existing ecosystem through their well thought out API abstraction layer and their out-of-the-box support for almost every important Java framework. Others may counter that performance is far better when the business logic and the job management are integrated. All interesting issues to trade off in your own decision making process. GridGain is free, as their business model is based on providing support and consultation. A non-starter for many is the Java-only restriction. What is unique about GridGain is how easy and transparent they made it to use and deploy. That's some thoughtful engineering.

    Related Articles

  • Gridify Blog
  • Ten Useful Gridgain How-To Tips
  • 10 Reasons to Use GridGain
  • What is Grid Gain?
  • Developers Productivity: Unsung Hero of GridGain
  • GridGain vs Hadoop
  • Cameron Purdy: Defining a Data Grid
  • Compute Grids vs. Data Grids


    Monday
    Sep 22 2008

    Paper: On Delivering Embarrassingly Distributed Cloud Services

    How do we scale datacenters? Should we build a few mammoth million machine datacenters or many smaller micro datacenters? Intuitively we usually go with a bigger is better economies of scale type argument, but it may not be so. What works for Walmart may not work for White Box World. Mega datacenters may actually exhibit diseconomies of scale. It may be better to run applications over many distributed micro datacenters instead of one large one. This paper by Ken Church, Albert Greenberg, and James Hamilton, all from Microsoft, takes a look at the different issues and concludes:

    Putting it all together, the micro model offers a design point with attractive performance, reliability, scale and cost. Given how much the industry is currently investing in the mega model, the industry would do well to consider the micro alternative.

    Related Articles

  • Embarrassingly Distributed Cloud Services by James Hamilton
  • Diseconomies of Scale by James Hamilton.
  • Architecture for Modular Datacenters by James Hamilton.
  • Enterprise Data Center Design and Methodology by Rob Snevely. Enterprise Data Center Design and Methodology is a practical guide to designing a data center from inception through construction. The fundamental design principles take a simple, flexible, and modular approach based on accurate, real-world requirements and capacities. This approach contradicts the conventional (but totally inadequate) method of using square footage to determine basic capacities like power and cooling requirements.


    Sunday
    Aug 17 2008

    Wuala - P2P Online Storage Cloud

    How do you design a reliable distributed file system when the expected availability of the individual nodes is only ~1/5? That is the case for P2P systems. Dominik Grolimund, the founder of Caleido, a Swiss startup, will show you how! They have launched Wuala, the social online storage service which scales as new nodes join the P2P network. The goal of Wua.la is to provide distributed online storage that is:

    • large
    • scalable
    • reliable
    • secure
    by harnessing the idle resources of participating computers. This challenge is an old dream of computer science. In fact, as Andrew Tanenbaum wrote in 1995: "The design of a world-wide, fully transparent distributed filesystem for simultaneous use by millions of mobile and frequently disconnected users is left as an exercise for the reader." After three years of research and development at ETH Zurich, the Swiss Federal Institute of Technology, on a distributed storage system, Caleido is ready to unveil the result: Wuala. Wuala is a new way of storing, sharing, and publishing files on the internet. It enables its users to trade parts of their local storage for online storage, and it allows us to provide a better service for free. In this Google Tech Talk, Dominik will explain what Wuala is and how it works, and he will also show a demo.

    The availability problem is solved by redundancy (just like in Google File System). However, simple replication techniques would result in too much overhead because of the low availability of the nodes. Instead, Wuala employs erasure coding and splits the data into small pieces. Optimal erasure codes produce n/r fragments, where any n fragments are sufficient to recover the original message. These pieces are then distributed in the P2P network, providing good availability at a reasonable overhead. The P2P network consists of client, storage, and routing nodes. The Wuala architecture uses a mix of regular and random graphs to optimize routing.

    Dominik also explains how the Wuala architecture is designed to provide security and fairness. Wuala employs the 128-bit AES algorithm for encryption and the 2048-bit RSA algorithm for authentication. If you're interested in how Wuala manages encryption, have a look at their publication on Cryptree. They have also implemented distributed reputation, audit, and maintenance functions. Check out the Tech Talk! It is worth the time!
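
    A quick back-of-the-envelope calculation shows why erasure coding beats plain replication at ~20% node availability. The parameters below are illustrative only (Wuala's actual coding parameters aren't given in the talk summary): at the same 10x storage overhead, ten full replicas can all be offline at once fairly often, while a code that can rebuild a file from any 10 of 100 fragments almost never fails.

    public class ErasureAvailability {

        // Binomial coefficient C(m, k), computed as a double to avoid overflow.
        static double choose(int m, int k) {
            double c = 1.0;
            for (int i = 1; i <= k; i++) {
                c = c * (m - k + i) / i;
            }
            return c;
        }

        // Probability that at least n of m fragments are reachable when each
        // hosting node is independently online with probability p.
        static double recoverProbability(int n, int m, double p) {
            double sum = 0.0;
            for (int k = n; k <= m; k++) {
                sum += choose(m, k) * Math.pow(p, k) * Math.pow(1 - p, m - k);
            }
            return sum;
        }

        public static void main(String[] args) {
            double p = 0.2; // ~1/5 node availability, as in the talk

            // 10x overhead as plain replication: survive if any of 10 copies is up.
            System.out.printf("10 full replicas: %.4f%n", 1 - Math.pow(1 - p, 10));

            // 10x overhead as erasure coding: any 10 of 100 fragments rebuild the file.
            System.out.printf("10-of-100 coding: %.4f%n", recoverProbability(10, 100, p));
        }
    }

    Run it and replication comes out around 0.89 availability while the erasure-coded scheme is better than 0.99; that gap is the whole argument for fragmenting data on a low-availability P2P network.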


    Monday
    Jul 21 2008

    Eucalyptus - Build Your Own Private EC2 Cloud

    Update: InfoQ links to a few excellent Eucalyptus updates: Velocity Conference Video by Rich Wolski and a Virtualization.com interview, Rich Wolski on Eucalyptus: Open Source Cloud Computing.

    Eucalyptus is generating some excitement on the Cloud Computing group as a potential vendor-neutral, EC2-compatible cloud platform. Two reasons why Eucalyptus is potentially important: private clouds and cloud portability. Private clouds: let's say you want a cloud-like infrastructure for architectural purposes, but you want it to run on your own hardware in your own secure environment. How would you do this today? Hm.... Cloud portability: with the number of cloud offerings increasing, how can you maintain some level of vendor neutrality among this "swarm" of different options? Portability is a key capability for cloud customers, as the only real power customers have is in where they take their business, and the only way you can change suppliers is if there's a ready market of fungible services. And the only way there can be a market is if there's a high degree of standardization.

    What should you standardize on? The options are usually to form a great committee and take many years to spec out something that doesn't exist, nobody will build, and will never really work. Or have each application create a high enough interface layer that portability is difficult, but possible. Or you can take a popular existing API, make it the general API, and everyone else is accommodated using an adapter layer and the necessary special glue to take advantage of value-add features for each cloud. With great foresight Eucalyptus has chosen to create a cloud platform based on Amazon's EC2. As this is the most successful cloud platform it makes a lot of sense to use it as a model. We see something similar with the attempts to port Google AppEngine to EC2, thus making GAE a standard framework for web apps. So developers would see GAE on top of EC2. A lot of code would be portable between clouds using this approach. Even better would be to add in ideas from RightScale, 3Tera, and Mosso to get a higher level view of the cloud, but that's getting ahead of the game.

    Just what is Eucalyptus? From their website: Elastic Computing, Utility Computing, and Cloud Computing are (possibly synonymous) terms referring to a popular SLA-based computing paradigm that allows users to "rent" Internet-accessible computing capacity on a for-fee basis. While a number of commercial enterprises currently offer Elastic/Utility/Cloud hosting services and several proprietary software systems exist for deploying and maintaining a computing Cloud, standards-based open-source systems have been few and far between. EUCALYPTUS -- Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems -- is an open-source software infrastructure for implementing Elastic/Utility/Cloud computing using computing clusters and/or workstation farms. The current interface to EUCALYPTUS is interface-compatible with Amazon.com's EC2 (arguably the most commercially successful Cloud computing service), but the infrastructure is designed to be modified and extended so that multiple client-side interfaces can be supported. In addition, EUCALYPTUS is implemented using commonly-available Linux tools and basic web service technology, making it easy to install and maintain.

    Overall, the goal of the EUCALYPTUS project is to foster community research and development of Elastic/Utility/Cloud service implementation technologies, resource allocation strategies, service level agreement (SLA) mechanisms and policies, and usage models. The current release is version 1.0 and it includes the following features:
    • Interface compatibility with EC2
    • Simple installation and deployment using Rocks cluster-management tools
    • Simple set of extensible cloud allocation policies
    • Overlay functionality requiring no modification to the target Linux environment
    • Basic "Cloud Administrator" tools for system management and user accounting
    • The ability to configure multiple clusters, each with private internal network addresses, into a single Cloud
    The initial version of EUCALYPTUS requires Xen to be installed on all nodes that can be allocated, but no modifications to the "dom0" installation or to the hypervisor itself. For more discussion see:

  • James Urquhart's excellent blog The Wisdom of Clouds.
  • Simon Wardley's post Open sourced EC2 .... not by Amazon.
  • Google Cloud Computing Group.
  • Eucalyptus and You by James Urquhart
  • Open Virtual Machine Format on LayerBoom. The Open Virtual Machine Format, or OVF is a proposed universal format that aims to create a secure, extensible method of describing and packaging virtual containers.
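
    To make the interface-compatibility point concrete: an existing EC2 client can usually be pointed at a Eucalyptus cloud just by overriding the service endpoint. The sketch below uses the AWS SDK for Java, which postdates this post, and an endpoint URL following the Eucalyptus 1.x convention (port 8773, /services/Eucalyptus); treat both the SDK choice and the URL as assumptions to verify against your own install.

    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.ec2.AmazonEC2Client;
    import com.amazonaws.services.ec2.model.DescribeInstancesResult;

    public class PrivateCloudClient {
        public static void main(String[] args) {
            // Credentials issued by the private Eucalyptus cloud, not Amazon.
            AmazonEC2Client ec2 = new AmazonEC2Client(
                new BasicAWSCredentials("EUCA_ACCESS_KEY", "EUCA_SECRET_KEY"));

            // The only portability change: swap the endpoint URL.
            ec2.setEndpoint("http://cloud.internal.example.com:8773/services/Eucalyptus");

            // The same EC2 calls now run against the private cloud.
            DescribeInstancesResult result = ec2.describeInstances();
            System.out.println("Reservations: " + result.getReservations().size());
        }
    }

    That single endpoint override is the portability story in miniature: the rest of the code doesn't know or care whether it is talking to Amazon or to your own hardware.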


    Sunday
    Jul 20 2008

    The clouds are coming

    A report from the CloudCamp conference on cloud computing, held in London in July 2008.


    Thursday
    Jul 10 2008

    Can cloud computing smite down evil zombie botnet armies?

    In the more-cool-stuff-I've-never-heard-of-before department is something called Self Cleansing Intrusion Tolerance (SCIT). Botnets are created when vulnerable computers live long enough to become infected with the will to do the evil bidding of their evil masters. Security is almost always about removing vulnerabilities (a process which to outside observers often looks like a dog chasing its tail). SCIT takes a different approach: it works on the availability angle. Something I never thought of before, but which makes a great deal of sense once I thought about it.

    With SCIT you stop and restart VM instances every minute (or whatever, depending on your desired vulnerability window). This short exposure window means worms and viruses do not have long enough to fully infect a machine and carry out a coordinated attack. A machine is up for a while. Does work. And then is torn down again, only to be reborn as a clean VM with no possibility of infection (unless of course the VM mechanisms become infected). It's like curing cancer by constantly moving your consciousness to new blemish-free bodies. Hmmm...

    SCIT is really a genius approach to scalable (I have to work in scalability somewhere) security and fits perfectly with cloud computing and swarm (cloud of clouds) computing. Clouds provide plenty of VMs so there is a constant ready supply of new hosts. From a software design perspective EC2 has been training us to expect failures and build Crash Only Software. We've gone stateless where we can, so load balancing to a new VM is no problem. Where we can't go stateless we use work queues and clusters, so again, reincarnating to new VMs is not a problem. So purposefully restarting VMs to starve zombie networks was born for cloud computing. If a wider move could be made to cloud-backed thin clients the internet might be a safer place to live, play, and work. Imagine being free(er) from spam blasts and DDoS attacks. Oh what a wonderful world it would be...
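
    The rotation loop at the heart of the idea is simple enough to sketch. Everything below is hypothetical scaffolding, not code from the SCIT project: CloudProvider stands in for whatever VM and load balancer API you actually use. What matters is the order of operations: boot a clean instance from a pristine image, shift traffic to it, then destroy the old, possibly-infected one.

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class ScitRotator {

        /** Hypothetical VM/load-balancer API; substitute your cloud's real client. */
        interface CloudProvider {
            String launchCleanInstance(String image);   // boot from a pristine image
            void addToLoadBalancer(String instanceId);
            void removeFromLoadBalancer(String instanceId);
            void terminate(String instanceId);
        }

        private final CloudProvider cloud;
        private final String image;
        private String current;

        ScitRotator(CloudProvider cloud, String image) {
            this.cloud = cloud;
            this.image = image;
            this.current = cloud.launchCleanInstance(image);
            cloud.addToLoadBalancer(current);
        }

        /** Replace the running VM with a fresh one, bounding the exposure window. */
        synchronized void rotate() {
            String fresh = cloud.launchCleanInstance(image); // clean rebirth
            cloud.addToLoadBalancer(fresh);                  // traffic shifts over
            cloud.removeFromLoadBalancer(current);
            cloud.terminate(current);                        // any infection dies here
            current = fresh;
        }

        /** Stand-in provider so the sketch runs; replace with a real cloud API. */
        static class StubProvider implements CloudProvider {
            private int n = 0;
            public String launchCleanInstance(String image) { return image + "-vm" + (n++); }
            public void addToLoadBalancer(String id) { System.out.println("LB add    " + id); }
            public void removeFromLoadBalancer(String id) { System.out.println("LB remove " + id); }
            public void terminate(String id) { System.out.println("terminate " + id); }
        }

        public static void main(String[] args) {
            ScitRotator rotator = new ScitRotator(new StubProvider(), "pristine");
            ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
            // Recycle the VM every 60 seconds, the post's example exposure window.
            timer.scheduleAtFixedRate(rotator::rotate, 60, 60, TimeUnit.SECONDS);
        }
    }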


    Wednesday
    Jun 04 2008

    LinkedIn Architecture

    LinkedIn is the largest professional networking site in the world. LinkedIn employees presented two sessions about their server architecture at JavaOne 2008. This post contains a summary of these presentations. Key topics include:

    • Up-to-date statistics about the LinkedIn user base and activity level
    • The evolution of the LinkedIn architecture, from 2003 to 2008
    • "The Cloud", the specialized server that maintains the LinkedIn network graph
    • Their communication architecture


    Tuesday
    May 27 2008

    How I Learned to Stop Worrying and Love Using a Lot of Disk Space to Scale

    Update 3: ReadWriteWeb says Google App Engine Announces New Pricing Plans, APIs, Open Access. Pricing is specified but I'm not sure what to make of it yet. An image manipulation library has been added (thus the need to pay for more CPU :-) and memcached support has been added. Memcached will help resolve the can't-write-for-every-read problem that pops up when keeping counters. Update 2: onGWT.com threw a GAE load party and a lot of people came. The results are at Load test: Google App Engine = 1, Community = 0. GAE handled a peak of 35 requests/second and a sustained 10 requests/second. Some think performance was good, others not so good. My GMT watch broke and I was late to arrive. Maybe next time. Also added a few new design rules from the post. Update: Added a few new rules gleaned from the GAE Meetup: Design By Explicit Cost Model and Puts are Precious.

    How do you structure your database using a distributed hash table like BigTable? The answer isn't what you might expect. If you were thinking of translating relational models directly to BigTable then think again. The best way to implement joins with BigTable is: don't. You--pause for dramatic effect--duplicate data instead of normalizing it. *shudder*

    Flickr anticipated this design in their architecture when they chose to duplicate comments in both the commenter and the commentee user shards rather than create a separate comment relation. I don't know how that decision was made, but it must have gone against every fiber in their relational bones... But Flickr’s reasoning was genius. To scale you need to partition. User data must spread across the shards. So where do comments belong in a scalable architecture? From one world view comments logically belong to a relation binding comments and users together. But if your unit of scalability is the user shard there is no separate relation space. So you go against all your training and decide to duplicate the comments. Nerd heroism at its best. Let inductive rules derived from observation guide you rather than deductions from arbitrarily chosen first principles. Very Enlightenment era thinking. Voltaire would be proud.

    In a relational world duplication is removed in order to prevent update anomalies. Error prevention is the driving force in relational modeling. Normalization is a kind of ethical system for data. What happens, for example, if a comment changes? Both copies of the comment must be updated. That leads to errors because who can remember where all the data is stored? A severe ethical violation may happen. Go directly to relational jail :-)

    BigTable data ethics are more Mardi Gras than dinner with the in-laws. Data just wants to have fun. BigTable won’t stop you from hurting yourself. And to get the best results you may have to engage in some conventionally risky behaviors. But if those are the glass bead necklaces you have to give for a peek at scalability, why not take a walk on the wild side? For a more modern post-relational discussion of data ethics I’m using as my primary source a thread of conversations from JA Robson, Ben the Indefatigable, Michael Brunton-Spall, and especially Brett Morgan. According to our new Voltaire, Locke, Bacon, and Newton, here’s what it takes to act ethically in a BigTable world:
  • Don’t bother with BigTable unless your goal is to create a web site that scales to millions of users. The techniques for building scalable read-mostly web applications are difficult and require a radical mindset change. Standard relational techniques work very well until you scale to huge numbers of users. It is at that point you need to break the rules and do something counter-intuitively different. More of the same will not work. If you don’t plan to get to that point it may not be worth the effort to change. BigTable is targeted at building web applications; its nature makes it a poor match for OLAP, data warehousing, data mining, and other applications performing complex data manipulations.
  • Assume slower random data access rather than fast sequential access. Every get of an entity could be from a different disk block on a different machine in a cluster. Calculating, for example, the average over a column in SQL can be efficient because data is stored together on disk. In BigTable data can be anywhere, so iterating over every value in a column is expensive. Each read is potentially a random block from anywhere, which means the average retrieval time can be relatively high. The implication is that to use BigTable you must adopt some unfamiliar and unintuitive strategies in order to deal with such a very different performance profile. Using relational databases we are used to writing applications against fast, highly performant databases. With BigTable you have to become familiar with the rules for developing against a slower but more scalable database. Neither approach is better for all purposes, but BigTable has the edge for high scalability.
  • Group data for concurrent reads. Given the high cost of reading data from BigTable your application will not scale if every page requires a large number of reads. The solution: denormalize. Store data in the same entity based on what data needs to be read concurrently. Relational modeling groups data together based on the “minimize problems” rule. BigTable’s new rule is “maximize concurrent reads” which implies denormalization. Store entities so they can be read in one access rather than performing a join requiring multiple reads. Instead of storing attributes in separate entities in order to remove duplication, duplicate the attributes and store them where they need to be used. Following this rule minimizes the number of reads required to return an entity.
  • Disk and CPU are cheap so stop worrying about them and scale. A criticism of denormalization is storing duplicate data wastes disk space. Google’s architecture trades disk space for better performance. Disk is (relatively) cheap, so don’t fight it. On the CPU front a data center’s worth of CPU is at your service. As long as you structure your application in the way GAE forces you to, your application can scale as large as it needs to simply by running on more machines. All scalability bottlenecks have been removed.
  • Structure data around how it will be used. Trade SQL sets for application based entities. Queries are slow so the closer data is to the format it is to be used the faster pages will render. It’s like the database model becomes the model previously used at the caching layer. Complete entities tend to be cached, not low level detail rows. That’s what BigTable models should look like because that’s how concurrent reads are maximized. This isn’t the same as an object oriented database because the behavior is provided by applications, behavior is not bound to the entity so multiple applications can read the same entities yet implement very different behaviors.
  • Compute attributes at write time. Since looping over large columns of data is inefficient with BigTable the idea is to calculate values at write time instead of read time. For example, instead of calculating an average by reading an entire column at read time, track the total number and the total value at write time so the average can be calculated with one read on page display. Programmer effort is made up front at write time to minimize the work needed at read time. Preventing applications from iterating over huge data is key for making applications scale. Given the limitations of GAE transactions and quotas, GAE may not be appropriate for business applications that need exact summary statistics. Warning: if the summary stat is written on every read request then this approach will not scale as writes don't scale.
  • Create large entities with optional fields. Normalization creates lots of small entities. Instead, create larger entities with optional parts so you can do one read and then determine what’s present at run time. This shifts work from the database to the CPU while minimizing the number of joins.
  • Define schemas in models. Denormalization requires user developed code to properly keep data consistent across multiple entities. The database won’t do it for you anymore. Schemas are really defined in code because it’s only code that can track all the relationships and maintain correctness. All database access must go through the models or otherwise the much feared inconsistency problems will result.
  • Hide updates using Ajax. Updates are slow, so big bang updates of many entities will appear slow to users. Instead, use Ajax to update the database in little increments. As a user enters form data, update the database so the update cost is amortized over many calls rather than one big call at the end. The result is a good user experience and a more scalable app.
  • Puts are Precious. Updating entities in large batches, say even 200 at a time, isn't part of the BigTable model. Entity attributes are automatically and synchronously indexed on writes. Indexing is an expensive operation that accumulates a lot of CPU time, so the number of updates that can be performed in one query is quite limited. The work-around is to perform updates in smaller batches driven by an external CPU. Even when GAE provides the ability to run batches within GAE, the programming model for writes needs to be accounted for in a design.
  • Design By Explicit Cost Model. If you are going to be charged for an operation GAE wants you to explicitly ask for it. This is why some automatic navigation between objects isn't provided because that will force an explicit query to be written. Writing an explicit query is a sort of EULA for being charged. Click OK in the form of a query and you've indicated that you are prepared to pay for a database operation.
  • Place a many-to-many relation in the entity with the fewest number of elements. One way to create a many-to-many relationship is to have a list property that contains keys to the other related entities. A Company entity, for example, could contain a list of keys to Contact entities or a Contact entity could contain a list of keys to Company entities. Since it's likely a Contact is associated with fewer Companies the list should be contained in the Contact. The reasoning is maintaining large lists is relatively inefficient so you want to minimize the number of items in a list as much as possible.
  • Avoid unbounded queries. Large queries don't scale. Consider showing only the most recent 10 or so values from an attribute.
  • Avoid contention on datastore entities. If every request to your app reads or writes a particular entity, latency will increase as your traffic goes up because reads and writes on a given entity are sequential. One example construct you should avoid at all costs is the global counter, i.e. an entity that keeps track of a count and is updated or read on every request.
  • Avoid large entity groups. Any two entities that share a common ancestor belong to the same entity group. All writes to an entity group are sequential, so large entity groups can bog down popular apps quickly if there are a lot of writes to that group. Instead, use small, localized groups in your design.
  • Shard counters. Increment one of N counters and sum those N counters on the read side. This avoids the dreaded write bottleneck. See Efficient Global Counters by App Engine Fan for more details. An excellent example showing some of these principles in action can be found in this GQL thread. Take this nicely normalized schema:
    Customer:
    - Name
    - Country
    Product:
    - Code
    - Name
    - Description
    Purchases:
    - Reference to Product Entity
    - Reference to Customer Entity
    - Date of order
    
    Anyone from a relational background would look at this schema and give it a big thumbs up. With a little effort we can imagine the original physical purchase order that has now been normalized into three different tables. To recreate the original purchase order a join on purchases, product, and customer is needed. Read speed is not optimized, safety is optimized. Here’s what the same schema looks like optimized for reading:
    Purchase: 
    - Customer Name 
    - Customer Country 
    - Product Code 
    - Product Name 
    - Purchase Order Number 
    - Date Of Order
    
    The three original tables have been folded into one entity. Now a purchase order can be read in one get operation. No join necessary. Notice how the entity looks more like an original purchase order. It is also what would probably be cached and is what our model would probably look like. But what if you want to update a product name or a customer name? Those attributes are duplicated in all entities. Here’s where the protection offered by the relational model comes in. Only one entity needs updating in a normalized model. In BigTable you have to remember everywhere a customer name and product name are stored and change every instance to the new values. It’s not a simple, safe, or reliable approach. But it does optimize for read speed and scalability. For an application with a high proportion of updates to reads this approach wouldn’t make sense. But on the web reads usually dominate. How often do you really change a customer name or a product name? Seldom. How often do you read them? All the time. Designing to scale for reads and taking the pain on writes takes some getting used to. It’s a massive change from standard relational tactics. But this is what it takes to scale web applications, even if it feels a little strange at first.
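
    For completeness, here is roughly what writing and reading the read-optimized Purchase entity looks like in code. This sketch uses App Engine's low-level Java datastore API; GAE was Python-only when this was written, so take it as an illustration of the modeling style rather than anything from the original thread.

    import com.google.appengine.api.datastore.DatastoreService;
    import com.google.appengine.api.datastore.DatastoreServiceFactory;
    import com.google.appengine.api.datastore.Entity;
    import com.google.appengine.api.datastore.EntityNotFoundException;
    import com.google.appengine.api.datastore.Key;
    import com.google.appengine.api.datastore.KeyFactory;

    public class PurchaseStore {
        private final DatastoreService ds = DatastoreServiceFactory.getDatastoreService();

        // Write time: duplicate the customer and product attributes into the
        // Purchase entity so reads never need a join.
        public Key recordPurchase(String orderNumber, String customerName,
                                  String customerCountry, String productCode,
                                  String productName, java.util.Date orderDate) {
            Entity purchase = new Entity("Purchase", orderNumber);
            purchase.setProperty("customerName", customerName);
            purchase.setProperty("customerCountry", customerCountry);
            purchase.setProperty("productCode", productCode);
            purchase.setProperty("productName", productName);
            purchase.setProperty("dateOfOrder", orderDate);
            return ds.put(purchase);
        }

        // Read time: the whole purchase order comes back in a single get.
        public Entity getPurchase(String orderNumber) throws EntityNotFoundException {
            return ds.get(KeyFactory.createKey("Purchase", orderNumber));
        }
    }

    The cost lands exactly where the discussion above says it does: renaming a customer now means finding and rewriting every Purchase entity that duplicates that name.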

    Related Articles

  • ER-Modeling with Google App Engine (updated)
  • Tips on writing scalable apps


    Monday
    Apr 21 2008

    Google App Engine - what about existing applications?

    Recently, Google announced Google App Engine, another announcement in the rapidly growing world of cloud computing. This brings up some very serious questions:

    1. If we want to take advantage of one of the clouds, are we doomed to be locked in for life?
    2. Must we rewrite our existing applications to use the cloud?
    3. Do we need to learn a brand new technology or language for the cloud?

    This post presents a pattern that will enable us to abstract our application code from the underlying cloud provider infrastructure. This will enable us to easily migrate our EXISTING applications to a cloud-based environment, thus avoiding the need for a complete rewrite.
