Tumblr Architecture - 15 Billion Page Views a Month and Harder to Scale than Twitter
Monday, February 13, 2012 at 9:15AM
HighScalability Team in Example

With over 15 billion page views a month, Tumblr has become an insanely popular blogging platform. Users may like Tumblr for its simplicity, its beauty, its strong focus on user experience, or its friendly and engaged community, but like it they do.

Growing at over 30% a month has not been without challenges, some reliability problems among them. It helps to realize that Tumblr operates at a surprisingly huge scale: 500 million page views a day, a peak rate of ~40k requests per second, ~3TB of new data to store a day, all running on 1000+ servers.
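To put those numbers in perspective, here is a rough back-of-envelope sketch in Python. It only rearranges the figures quoted above. The function name `summarize`, the choice of decimal terabytes, and the idea of comparing peak requests to average page views are assumptions made for illustration, not anything Tumblr reported. Page views and requests are different units, so the peak-to-average ratio is at best a loose indicator.

```python
# Back-of-envelope arithmetic on the figures quoted in the article.
# Everything here is a sketch: the inputs come from the text, the
# derived numbers are illustrative and not reported by Tumblr.

SECONDS_PER_DAY = 24 * 60 * 60

PAGE_VIEWS_PER_MONTH = 15e9      # "over 15 billion page views a month"
PAGE_VIEWS_PER_DAY = 500e6       # "500 million page views a day"
PEAK_REQUESTS_PER_SEC = 40e3     # "~40k requests per second"
NEW_DATA_BYTES_PER_DAY = 3e12    # "~3TB", assuming decimal terabytes
MONTHLY_GROWTH = 0.30            # "over 30% a month"


def summarize():
    """Derive a few rough rates from the quoted figures."""
    avg_views_per_sec = PAGE_VIEWS_PER_DAY / SECONDS_PER_DAY
    return {
        # Monthly and daily figures agree if a month is ~30 days.
        "days_implied": PAGE_VIEWS_PER_MONTH / PAGE_VIEWS_PER_DAY,
        # Average page views per second over a full day.
        "avg_views_per_sec": avg_views_per_sec,
        # Loose ratio only: requests and page views are not the same unit.
        "peak_to_avg": PEAK_REQUESTS_PER_SEC / avg_views_per_sec,
        # Sustained write rate needed to absorb the new data, in MB/s.
        "ingest_mb_per_sec": NEW_DATA_BYTES_PER_DAY / SECONDS_PER_DAY / 1e6,
        # What 30% a month compounds to if sustained for twelve months.
        "yearly_growth_factor": (1 + MONTHLY_GROWTH) ** 12,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the daily figure averages out to roughly 5,800 page views per second, the new data amounts to a sustained write rate of about 35 MB/s, and 30% monthly growth, if it held for a full year, would compound to roughly a 23x increase.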

One of the common patterns across successful startups is the perilous chasm crossing from startup to wildly successful startup. Finding people, evolving infrastructures, and servicing old infrastructures, while handling huge month-over-month increases in traffic, all with only four engineers, means you have to make difficult choices about what to work on. This was Tumblr’s situation. Now with twenty engineers there’s enough energy to work on issues and develop some very interesting solutions.

Tumblr started as a fairly typical large LAMP application. The direction they are moving in now is towards a distributed services model built around Scala, HBase, Redis, Kafka, Finagle, and an intriguing cell-based architecture for powering their Dashboard. Effort is now going into fixing short-term problems in their PHP application, pulling things out, and doing it right using services.

The theme at Tumblr is transition at massive scale: a transition from a LAMP stack to a somewhat bleeding-edge stack, and a transition from a small startup team to a fully armed and ready development team churning out new features and infrastructure. Helping us understand how Tumblr is living this theme is startup veteran Blake Matheny, Distributed Systems Engineer at Tumblr. Here’s what Blake has to say about the House of Tumblr:

Site: http://www.tumblr.com/

Stats

Software

Hardware

Architecture

Old Tumblr

New Tumblr

Internal Firehose

Cell Design for Dashboard Inbox

On Being a Startup in New York

Team Structure

Software Deployment

Development

Hiring Process

Lessons learned


I’d like to thank Blake very much for the interview. He was very generous with his time and patient with his explanations. Please contact me if you would like to talk about having your architecture profiled.

Article originally appeared on (http://highscalability.com/).
See website for complete article licensing information.