Thursday, February 13, 2014

Snabb Switch - Skip the OS and Get 40 million Requests Per Second in Lua

Snabb Switch is a toolkit for solving novel problems in networking. If you are building a new packet-processing network appliance, you can use Snabb Switch to get the job done more quickly.

Here's a great impassioned overview from erichocean:

Or, you could just avoid the OS altogether: https://github.com/SnabbCo/snabbswitch

Our current engineering target is 1 million writes/sec and > 10 million reads/sec on a single box, on top of an architecture similar to that. Those reads and writes go to our fully transactional MVCC database (writes do not block reads, and vice versa), which runs in the same process (a la SQLite). We've also merged it with our application code and our caching tier, so we're down to, literally, a single process for what would have been at least three separate tiers in a traditional setup.

The result is that we had to move to measuring request latency in microseconds exclusively. The architecture (without additional application-specific processing) supports a wire-to-wire messaging speed of 26 nanoseconds, or approx. 40 million requests per second. And that's written in Lua!

To put that in perspective, that kind of performance is about 1/3 of what you'd need to be able to do to handle Facebook's messaging load (on average, obviously, Facebook bursts higher than the average at times...).

Point being, the OS is just plain out-of-date for how to solve heavy data plane problems efficiently. The disparity between what the OS can do and what the hardware is capable of delivering is off by a few orders of magnitude right now. It's downright ridiculous how much performance we're giving up for supposed "convenience" today.
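The figures in the quote are easier to judge once the arithmetic is spelled out. The sketch below converts the quoted 26 ns wire-to-wire time into a request rate and turns the quoted database targets into a per-request time budget. It assumes one message is handled at a time, with no batching or parallelism, and the function names are made up for illustration. The "implied load" value is only what the quote's "about 1/3" remark works out to, not a measured number.

```python
NANOSECONDS_PER_SECOND = 1_000_000_000


def requests_per_second(latency_ns: float) -> float:
    """Rate of a pipeline that completes one request every latency_ns."""
    return NANOSECONDS_PER_SECOND / latency_ns


def per_request_budget_ns(rate_per_second: float) -> float:
    """Time available per request when sustaining rate_per_second."""
    return NANOSECONDS_PER_SECOND / rate_per_second


# 26 ns per message, as stated in the quote.
wire_to_wire_rate = requests_per_second(26)

# The quote calls this rate "about 1/3" of the load being compared against,
# so the implied average load is roughly three times the rate.
implied_load = 3 * wire_to_wire_rate

# Quoted engineering targets: 1 million writes/sec, 10 million reads/sec.
write_budget_ns = per_request_budget_ns(1_000_000)
read_budget_ns = per_request_budget_ns(10_000_000)

print(f"wire-to-wire rate: {wire_to_wire_rate / 1e6:.1f} million requests/sec")
print(f"implied load:      {implied_load / 1e6:.1f} million requests/sec")
print(f"write budget:      {write_budget_ns:.0f} ns per write")
print(f"read budget:       {read_budget_ns:.0f} ns per read")
```

At 26 ns per message the rate comes out a little under 38.5 million per second, which the quote rounds to 40 million. The budgets also show why latency has to be tracked at microsecond resolution or finer: at the quoted targets a write gets 1,000 ns and a read gets 100 ns.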

Reader Comments (4)

Where is the HTTP parser, if there is one?

February 13, 2014 | Unregistered Commenter ymo

"To put that in perspective, that kind of performance is about 1/3 of what you'd need to be able to do to handle Facebook's messaging load (on average, obviously, Facebook bursts higher than the average at times...)."

With data of what size? The same kind of size as Facebook's messaging system handles? You can get very impressive performance figures out of in-memory data, that suddenly become horrendous when disk is involved.

February 14, 2014 | Unregistered Commenter Twirrim

@Twirrim

Network throughput has got absolutely nothing to do with reading data from disk. You're supposed to be comparing this to something like a kernel networking stack, not talking about full application performance or caching or trying to "debunk" the figures by turning it into something it was never trying to be.

February 16, 2014 | Unregistered Commenter Craig

Thanks for this! If we have a unipurpose box then a general purpose OS isn't needed. The same goes for security, page faults, traps, etc., which all exist to support multiple disparate functions and users on one set of hardware. We just need interrupts for "the code"; there's no need to call it "user" or "kernel".

You mention messaging... we've been trying to decide if we should bypass the webserver layers entirely for some messaging that has high amounts of traffic. For example, we could build our app so that, for some things, clicking a webpage "button" sends a message from the app directly to where it needs to go, instead of contacting the webserver, etc. Your thoughts?

May 6, 2014 | Unregistered Commenter @gwhiz
