Entries in erlang (6)

Thursday
Apr252013

Paper: Making reliable distributed systems in the presence of software errors

Joe Armstrong is a co-inventor of Erlang and general all around renaissance software tinkerer as shown by his excellent work on writing a C Compiler and his voluminous work on GitHub.

Given the success of Erlang it's probably no surprise that he wrote his thesis on the ground breaking ideas behind Erlang: Making reliable distributed systems in the presence of software errors.

Even if you have yet to join the cult of Erlang the principles behind Erlang are universal and well worth exploring for your own designs. Highly recommended.

Introduction:

Click to read more ...

Monday
Nov262012

BigData using Erlang, C and Lisp to Fight the Tsunami of Mobile Data

This is a guest post by Jon Vlachogiannis. Jon is the founder and CTO of BugSense.

BugSense, is an error-reporting and quality metrics service that tracks thousand of apps every day. When mobile apps crash, BugSense helps developers pinpoint and fix the problem. The startup delivers first-class service to its customers, which include VMWare, Samsung, Skype and thousands of independent app developers. Tracking more than 200M devices requires fast, fault tolerant and cheap infrastructure.

The last six months, we’ve decided to use our BigData infrastructure, to provide the users with metrics about their apps performance and stability and let them know how the errors affect their user base and revenues.

We knew that our solution should be scalable from day one, because more than 4% of the smartphones out there, will start DDOSing us with data.

We wanted to be able to:

  • Abstract the application logic and feed browsers with JSON
  • Run complex algorithms on the fly
  • Experiment with data, without the need of a dedicated Hadoop cluster
  • Pre-process data and then store them (cutting down storage)
  • Be able to handle more than 1000 concurrent request on every node
  • Make “joins” in more than 125M rows per app
  • Do this without spending a fortune in servers

The solution uses:

  • Less than 20 large instances running on Azure
  • An in-memory database
  • A full blown custom LISP language written in C to implement queries, which is many times faster that having a VM (with a garbage collector) online all the time
  • Erlang for communication between nodes
  • Modified TCP_TIMEWAIT_LEN for an astonishing drop of 40K connections, saving on CPU, memory and TCP buffers
For more information on the BugSense architecture please keep on reading...

Click to read more ...

Wednesday
Dec172008

Ringo - Distributed key-value storage for immutable data

Ringo is an experimental, distributed, replicating key-value store based on consistent hashing and immutable data. Unlike many general-purpose databases, Ringo is designed for a specific use case: For archiving small (less than 4KB) or medium-size data items (<100MB) in real-time so that the data can survive K - 1 disk breaks, where K is the desired number of replicas, without any downtime, in a manner that scales to terabytes of data. In addition to storing, Ringo should be able to retrieve individual or small sets of data items with low latencies (<10ms) and provide a convenient on-disk format for bulk data access. Ringo is compatible with the map-reduce framework Disco and it was started at Nokia Research Center Palo Alto.

Click to read more ...

Wednesday
Sep032008

MapReduce framework Disco

Disco is an open-source implementation of the MapReduce framework for distributed computing. It was started at Nokia Research Center as a lightweight framework for rapid scripting of distributed data processing tasks. The Disco core is written in Erlang. The MapReduce jobs in Disco are natively described as Python programs, which makes it possible to express complex algorithmic and data processing tasks often only in tens of lines of code.

Click to read more ...

Wednesday
May142008

New Facebook Chat Feature Scales to 70 Million Users Using Erlang

UpdateErlang at Facebook by Eugene Letuchy. How Facebook uses Erlang to implement Chat, AIM Presence, and Chat Jabber support. 

I've done some XMPP development so when I read Facebook was making a Jabber chat client I was really curious how they would make it work. While core XMPP is straightforward, a number of protocol extensions like discovery, forms, chat states, pubsub, multi user chat, and privacy lists really up the implementation complexity. Some real engineering challenges were involved to make this puppy scale and perform. It's not clear what extensions they've implemented, but a blog entry by Facebook's Eugene Letuchy hits some of the architectural challenges they faced and how they overcame them.

A web based Jabber client poses a few problems because XMPP, like most IM protocols, is an asynchronous event driven system that pretty much assumes you have a full time open connection. After logging in the server sends a client roster information and presence information. Your client has to be present to receive the information. If your client wants to discover the capabilities of another client then a request is sent over the wire and some time later the response comes back. An ID is used to map the reply to the request. All responses are intermingled. IM messages can come in at any time. Subscription requests can come in at any time.

Facebook has the client open a persistent connection to the IM server and uses long polling to send requests and continually get data from the server. Long polling is a mixture of client pull and server push. It works by having the client make a request to the server. The client connection blocks until the server has data to return. When it does data is returned, the client processes it, and then is in position to make another request of the server and get any more data that has queued up in the mean time. Obviously there are all sorts of latency, overhead, and resource issues with this approach. The previous link discusses them in more detail and for performance information take a look at Performance Testing of Data Delivery Techniques for AJAX Applications by Engin Bozdag, Ali Mesbah and Arie van Deursen.

From a client perspective I think this approach is workable, but obviously not ideal. Your client's IMs, presence changes, subscription requests, and chat states etc are all blocked on the polling loop, which wouldn't have a predictable latency. Predictable latency can be as important as raw performance.

The real scaling challenge is on the server side. With 70 million people how do you keep all those persistent connections open? Well, when you read another $100 million was invested in Facebook for hardware you know why. That's one hella lot of connections. And consider all the data those IM servers must store up in between polling intervals. Looking at the memory consumption for their servers would be like watching someone breath. Breath in- streams of data come in and must be stored waiting for the polling loop. Breath out- the polling loops hit and all the data is written to the client and released from the server. A ceaseless cycle. In a stream based system data comes in and is pushed immediately out the connection. Only socket queue is used and that's usually quite sufficient. Now add network bandwidth for all the XMPP and TCP protocol overhead and CPU to process it all and you are talking some serious scalability issues.

So, how do you handle all those concurrent connections? They chose Erlang. When you first hear Erlang and Jabber you think ejabberd, an open source Erlang based XMPP server. But since the blog doesn't mention ejabberd it seems they haven't used it .

Why Erlang? First, the famous Yaws vs Apache shootout where "Apache dies at about 4,000 parallel sessions. Yaws is still functioning at over 80,000 parallel connections." Erlang is naturally good at solving high concurrency problems. Yet following the rule that no benchmark can go unchallenged, Erik Onnen calls this the Worst Measurement Ever and has some good reasoning behind it.

In any case, Erlang does nicely match the problem space. Erlang's approach to a concurrency problem is to throw a very light weight Erlang process at each state machine you want to be concurrent. Code-wise that's more natural than thread pools, async IO, or thread per connection systems. Until Linux 2.6 it wasn't even possible to schedule large numbers of threads on a single machine. And you are still devoting a lot of unnecessary stack space to each thread. Erlang will make excellent use of machine resources to handle all those connections. Something anyone with a VPS knows is hard to do with Apache. Apache sucks up memory with joyous VPS killing abandon.

The blog says C++ is used to log IM messages. Erlang is famously excellent for its concurrency prowess and equally famous for being poor at IO, so I imagine C++ was needed for efficiency.

One of the downsides of multi-language development is reusing code across languages. Facebook created Thrift to tie together the Babeling Tower of all their different implementation languages. Thrift is a software framework for scalable cross-language services development. It combines a powerful software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, and Ruby. Another approach might be to cross language barriers using REST based services.

A problem Facebook probably doesn't have to worry about scaling is the XMPP roster (contact list). Handling that many user accounts would challenge most XMPP server vendors, but Facebook has that part already solved. They could concentrate on scaling the protocol across a bunch of shiny new servers without getting bogged down in database issues. Wouldn't that be nice :-) They can just load balance users across servers and scalability is solved horizontally, simply by adding more servers. Nice work.

Saturday
May102008

Hitting 300 SimbleDB Requests Per Second on a Small EC2 Instance

High Performance Multithreaded Access to Amazon SimpleDB is a great follow up to the idea in How SimpleDB Differs from a RDBMS that more programming is the price paid for performance in SimpleDB. It shows how much work and infrastructure is required to batter better performance out of SimpleDB. Remember, in SimpleDB you get keys to records from queries so if you want to get all the fields for records you need to make separate requests. Since SimpleDB isn't exactly a speed daemon the obvious strategy is to parallelize. Even if a job takes a 100 msecs you can get a lot done in a little time if you can execute enough jobs in parallel. Parallelization is the approach taken by Haakon@AWS in his Java code example of how to get the most out of SimpleDB. You can find the code at Indexing and Querying Amazon S3 Metadata with Amazon SimpleDB. We'll also consider how a back-end service architecture built on Erlang may be a better fit with cloud computing. Two general mechanisms of parallelism are available: threads and boxes. To get the most bang out of a single machine you need threads (events, etc). To scale beyond the load handled by a single machine you need multiple boxes. The example code uses the Executor Thread Pool for parallelism within a program. Thread pools are a pretty common idiom by now. Amazon's queue service SQS was used to distribute work amongst boxes. Work was queued to SQS in batches of 1000 work items. The items were pulled by the thread pool and processed. Why 1000? The idea is to balance processing overhead with work overhead. You don't want popping items off SQS to dominate your processing time so you have to do enough work in each pass to make it worth the investment. The architecture uses two thread pools: one to run queries and one to get record values. Applications must carefully tune the number of threads in each pool so the queries to overwhelm the gets. Using a query thread pool with 2 threads and a get thread pool with 32 threads it was possible to perform 300 TPS on a small EC2 instances. Theoretically the advantage of this architecture is that it will scale to any size you need. SQS is your work distribution backbone and you just spin up the number of thread pool instances you need. The disadvantage is that this is a lot of programmer effort. But let's consider that you had to do some serious processing on each record, you would need something like this approach anyway to scale out the processing. But to perform simple aggregation operations it's total overkill which is why more time needs to be spent on the write site of the equation in SimpleDB/BigTable than the read side as we are used to with a RDBMS. What's the best way to go parallel? On the front-end life is simple. Go shared nothing and compose your pages from scalable back-end services. This is how Amazon does it and it's how Google AppEngine does it. GAE completely punts on the back-end service layer architecture. Unfortunately we still need to create a back-end architecture for more complex applications. Thread pools and SQS is one parallelization approach. Instead of thread pools something like Java's fork/join framework could be used. Initially I thought piling on more low level primitive threading facilities into Java was the wrong way to go. Yes, it is a "'multicore-friendly lightweight parallel framework' that supports a style of parallel programming where problems are recursively split into smaller fragments, solved in parallel and recombined," but it's also a style of programming that is very difficult to program correctly. If cloud architectures will rely on these primitives for efficiency then I think we have regressed. Erlang style architectures described by Luke Hoersten in Scalable Web Apps: Erlang + Python is a simpler more reliable to programming model. An event driven actor based approach is much harder to screw up than closely cooperating threads in a shared memory space. Erlang originally ran in embedded systems where the requirement was to reliably squeeze the most work possible out of limited CPU and other compute resources. Oddly enough the embedded node of old closely parallels your basic cloud VM. Start your work horse Erlang (or other similar system) instances and let them efficiently chew up your work loads. Erlang's scheduling model fits perfectly with a service centric job engine cloud instance. It will get more work done then your typical thread based system ever would.

Click to read more ...