Machine VM + Cloud API - Rewriting the Cloud from Scratch
Write a little "Hello World" program these days and it runs inside a bewildering Russian Doll of nested environments, each layer adding its own special performance and complexity tax. First, a language executes in its own environment of data structure libraries, memory management, and so on. That, more often than not, will run inside a language VM like the JVM, CLR, or V8. The language VM will in turn run inside a process that runs inside an OS. An application will run in one or more threads inside a process. And the whole thing will run inside a machine-sharing VM layer like Xen. And across all of that are frameworks for monitoring, elasticity, storage, and so on. That's a lot of overhead for such a little program.
What if we could remove all these taxes and run directly on the new bare metal, which some consider to be a combination of Machine VM + Cloud API? That's exactly what a system called Mirage, described in the paper Turning down the LAMP: Software Specialisation for the Cloud, sets out to do by treating the cloud virtual hardware as a compiler target, and converting high-level language source code directly into kernels that run on it.
From the paper:
Frameworks which currently use (for example) fork(2) on a host to spawn processes would benefit from using cloud management APIs to request resources and eliminate the distinction between cores and hosts...We instead view the cloud as a stable hardware platform, and present a programming framework which permits applications to be constructed to run directly on top of it without intervening software layers. Our prototype (dubbed Mirage) is unashamedly academic; it extends the Objective Caml language with storage extensions and a custom run-time to emit binaries that execute as a guest operating system under Xen. Mirage applications exhibit significant performance speedups for I/O and memory handling versus the same code running under Linux/Xen.
It's a fascinating idea. Operating systems have historically provided two services: hardware access and sharing of scarce resources. In the cloud, are these really necessary anymore? In the cloud you can't install special hardware so there's really no need to pretend that device drivers are important. Resources are now no longer scarce in the sense that they are acquired elastically via APIs. And with the move to service oriented architectures and the VM already sharing hardware, there's really no need to keep the ghost of a time sharing OS around either.
You are essentially developing inside a language environment using APIs and running that directly on the cloud. This model may have been strange at one time, but with the advent of PaaS products like Google App Engine, Salesforce, and Heroku, the idea of developing inside a language environment, using APIs, and simply deploying on a cloud has become a far more acceptable way of working. GAE, for example, hasn't dropped all illusion of boundaries yet (each program definitely runs inside an instance), but it isn't very far away from being able to drop that fiction entirely.
Historically, the idea isn't new either, but the cloud as the new computer definitely has put a different spin on it. Daniel Ingalls, author of Design Principles Behind Smalltalk, wrote: "An operating system is a collection of things that don't fit into a language. There shouldn't be one."
The return for switching to this new model should be much better performance as all the layers are collapsed. Mirage reported better performance for database access, though this part of the equation needs a lot more proof. Considered on a $/CPU and $/IO basis, shifting mental models of what programs are and how they run could pay off handsomely.
Related Articles
- Mirage - an open-source operating system for constructing secure, high-performance, reliable network applications across a variety of cloud computing and mobile platforms.
- A 'Fat-Free' Programming Framework for the Cloud by Chris Kanaracus.
- Good thread on Lambda the Ultimate. Thanks to Z-Bo for the Daniel Ingalls quote.
- Video of a talk on the paper at HotCloud. Unfortunately a password is required. If you get a password we can all use, please let me know.
- Papers related to Mirage
- Mesos - a platform for running multiple diverse cluster computing frameworks, such as Hadoop, MPI, and web services, on commodity clusters.
Reader Comments (9)
Shame the openmirage.org site is down, cloud computing for you! hehe
The openmirage.org links don't seem to work.
So from the paper it looks like they just moved their "application" into the kernel as a compiled module. That's not exactly unheard of. There are httpd servers that run as part of the kernel and are very fast; of course, they are also somewhat limited in what they can do.
Also their "security" section seems rather lacking since they talk about how they compile applications so they can be booted by the kernel (again, just a module, nothing special). But they seem to skip over the main point of using Xen in shared hosting which is being able to isolate applications/users from each other since one user won't own all of the applications on one machine. Unless they are doing something like chroot in the background (think BSD jails) and just don't mention it.
If you manage your own hardware I could see this as being useful. Hopefully their site will come back up so we can read in more detail about how they would protect users from each other on shared systems, not to mention what is required to build a compiled kernel module.
jstephens,
Conceptually, think of a Mirage application as a piece of software that contains what you would normally consider the user-space code of an application, plus only those code paths through the kernel that you actually care about. It is a bootable entity in its own right, much as a Linux kernel+root filesystem is. The only place kernel *modules* are mentioned in the paper is under "Push the limits of packaging" in Section 5; EC2 doesn't allow custom kernels, so a Mirage application would have to bootstrap itself by acting as a kernel module, calling some function in that module from userspace, and having the module "install" itself over the running kernel by overwriting it. Just a hack for some cloud hosting providers, nothing fundamental.
Just like a kernel can be paravirtualized (such that it can talk to a hypervisor API instead of the bare metal for some functions) and run on top of Xen instead of directly on the hardware, so can a Mirage application. Any isolation benefits of the virtualization platform available to a paravirtualized kernel are also available to a paravirtualized Mirage app.
As for compilation, it seems relatively straightforward if you take the paravirtualization approach they do. Their persistence layer is essentially:
OCaml <-> SQLite <-> Xen <-> Host Filesystem
They wrote the SQLite<->Xen shim, and the rest is taken care of by other layers. If you wanted to run a Mirage app on bare metal, you'd probably grab something like the Linux kernel sources and compile in the parts your application ends up using. I'm sure there would be a lot of software engineering effort to make it work, but it doesn't seem impossible or fraught with security concerns unique to such an architecture.
Doesn't that just mean that your application has to implement an OS? I'm all for removing unnecessary layers of abstraction, but those layers are usually there for a reason. You'll need to implement multi-threading unless you do something very specific, you'll need drivers, you'll need all the other things. I'd rather not have to design and code all that.
Link to GitHub site. Hooray for Google website caching?
Pies: the OS becomes a library, just like any other standard library you currently use, and won't be seen by the end programmer. You could achieve a similar aim by "statically linking" the Linux kernel to a single user-space application at compilation time, and depending on the compiler to remove unused code paths.
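To make "depending on the compiler to remove unused code paths" a little more concrete, here is a rough sketch of the idea as a reachability walk over a call graph. Everything in it is an assumption made for illustration: the function names are invented (they are not Mirage or Linux symbols), and a real compiler or linker works on symbols and sections and has to cope with things like function pointers that this toy ignores.

```python
from collections import deque

# Hypothetical call graph: each function maps to the functions it calls.
# All names are made up for this example.
CALL_GRAPH = {
    "app_main": ["http_serve", "db_query"],
    "http_serve": ["tcp_send", "tcp_recv"],
    "db_query": ["block_read"],
    "tcp_send": ["netfront_tx"],
    "tcp_recv": ["netfront_rx"],
    "block_read": ["blkfront_read"],
    "netfront_tx": [],
    "netfront_rx": [],
    "blkfront_read": [],
    # Present in the "OS library" but never reached from app_main:
    "usb_probe": ["pci_scan"],
    "pci_scan": [],
    "fork_process": ["sched_add"],
    "sched_add": [],
}


def reachable(graph, entry):
    """Return the set of functions reachable from `entry` (breadth-first)."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        fn = queue.popleft()
        for callee in graph.get(fn, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def link(graph, entry):
    """Split the graph into (kept, dropped) function names, both sorted."""
    kept = reachable(graph, entry)
    dropped = set(graph) - kept
    return sorted(kept), sorted(dropped)


kept, dropped = link(CALL_GRAPH, "app_main")
print("kept:   ", kept)
print("dropped:", dropped)
```

In this toy graph the application only ever touches the network and block paths, so the hypothetical USB, PCI, and process-scheduling functions never make it into the final image. That is the sense in which the OS is "just a library": whatever the application does not call is simply not linked in.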
Another great improvement is that EC2 permits custom kernels now, and there are scripts in the Mirage repo which compile applications directly to AMI files that can run on Amazon. It works great, except that I can't figure out how to construct an EBS image to take advantage of the new Micro instances (which are perfect and economical for Mirage kernels).
PS: www.openmirage.org is still very unstable as we're using it as a self-hosting testing ground with a huge amount of debug output per URL retrieval. That's pre-alpha software for you :)
A similar approach for Erlang: http://erlangonxen.org
The CLR is not a VM. .Net has never been interpreted as far as I know. MSIL has JIT shortly after the app/dll is loaded.