Wednesday, September 26, 2012

WordPress.com Serves 70,000 req/sec and over 15 Gbit/sec of Traffic using NGINX

This is a guest post by Barry Abrahamson, Chief Systems Wrangler at Automattic, and Andrew Alexeev, co-founder of Nginx.

WordPress.com serves more than 33 million sites, attracting over 339 million people and 3.4 billion page views each month. Since April 2008, WordPress.com page views have grown about 4.4 times. WordPress.com VIP hosts many popular sites, including CNN’s Political Ticker, NFL, Time Inc’s The Page, People Magazine’s Style Watch, and corporate blogs for Flickr and KROQ, among many others. Automattic operates two thousand servers in twelve globally distributed data centers. WordPress.com customer data is instantly replicated between different locations to provide an extremely reliable and fast web experience for hundreds of millions of visitors.

Problem

WordPress.com, which began in 2005, started on shared hosting, much like all of the WordPress.org sites. It was soon moved to a single dedicated server and then to two servers. In late 2005, WordPress.com opened to the public and by early 2006 had expanded to four web servers, with traffic being distributed using round robin DNS. Soon thereafter WordPress.com expanded to a second data center and then to a third. It quickly became apparent that round robin DNS wasn't a viable long-term solution.
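
To make the round robin DNS setup concrete, here is a minimal Python sketch of how it spreads lookups across four web servers. The addresses and function names are hypothetical, and the failure scenario illustrates one commonly cited limitation of round robin DNS (it keeps handing out a server's address whether or not that server is healthy). The post does not say which limitation drove the decision, so treat that part as an assumption.

```python
from collections import Counter
from itertools import cycle

# Hypothetical addresses for four web servers (documentation range 192.0.2.0/24).
WEB_SERVERS = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]


def round_robin_dns(addresses, num_queries):
    """Return the address handed out first for each of num_queries lookups."""
    rotation = cycle(addresses)
    return [next(rotation) for _ in range(num_queries)]


def share_sent_to_down_servers(addresses, down, num_queries):
    """Fraction of lookups that still point at servers listed in `down`.

    Plain round robin DNS has no health checks, so a failed server
    keeps receiving its usual share until its record is removed.
    """
    hits = Counter(round_robin_dns(addresses, num_queries))
    return sum(hits[address] for address in down) / num_queries


if __name__ == "__main__":
    print(Counter(round_robin_dns(WEB_SERVERS, 8)))
    print(share_sent_to_down_servers(WEB_SERVERS, {"192.0.2.3"}, 1000))
```

With four servers and one of them down, a quarter of the lookups in this simplified model still resolve to the failed machine, and client-side DNS caching (not modeled here) can make the effect last longer.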

While hardware appliances like F5 BIG-IPs offered many features that WordPress.com required, the 5-member Automattic Systems Team decided to evaluate different options built on existing open source software. Using open source software on commodity hardware provides the ultimate level of flexibility and also saves money: "Purchasing a pair of capable hardware appliances in a failover configuration for a single datacenter may be a little expensive, but purchasing and servicing 10 sets for 10 data centers soon becomes very expensive."

At first, the WordPress.com team chose Pound as a software load balancer because of its ease of use and built-in SSL support. After using Pound for about two years, WordPress.com required additional functionality and scalability, namely:

  • On-the-fly reconfiguration capabilities, without interrupting live traffic.
  • Better health check mechanisms, allowing a backend to recover smoothly and gradually from a failure without overloading the application infrastructure with an unexpected burst of requests.
  • Better scalability, both in requests per second and in the number of concurrent connections. Pound's thread-based model wasn’t able to reliably handle over 1000 requests per second per load balancing instance.
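
The "smooth and gradual recovery" requirement in the list above can be sketched as a weight ramp: a backend that has just passed its health check again starts with a small share of traffic and grows back to its full share over a fixed window. This is an illustrative Python sketch under assumed names and numbers (`ramped_weight`, a 30-second linear ramp); it is not how Pound or NGINX implements recovery.

```python
def ramped_weight(full_weight, seconds_since_recovery, ramp_seconds=30.0):
    """Weight for a recovered backend, rising linearly to full_weight.

    The linear shape and the 30-second default are assumptions
    chosen for illustration only.
    """
    if seconds_since_recovery <= 0:
        return 0.0
    fraction = min(seconds_since_recovery / ramp_seconds, 1.0)
    return full_weight * fraction


def traffic_shares(weights):
    """Convert per-backend weights into fractions of total traffic."""
    total = sum(weights.values())
    if total == 0:
        return {name: 0.0 for name in weights}
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical pool: web3 recovered 15 seconds ago.
    weights = {
        "web1": 10.0,
        "web2": 10.0,
        "web3": ramped_weight(10.0, seconds_since_recovery=15),
    }
    print(traffic_shares(weights))
```

Halfway through the ramp, the recovering backend in this example receives a fifth of the traffic rather than a full third, which gives its caches time to warm up before it takes a full load.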

Solution

In April 2008, Automattic converted all WordPress.com load balancers from Pound to NGINX. Before that, Automattic engineers had been using NGINX for Gravatar for a few months and were impressed by its performance and scalability, so moving WordPress.com over was the natural next step. Before switching WordPress.com to NGINX, Automattic evaluated several other products, including HAProxy and LVS. Here are some of the reasons why NGINX was chosen:

  • Easy, flexible and logical configuration.
  • Ability to reconfigure and upgrade NGINX instances on-the-fly, without dropping user requests.
  • Application request routing via FastCGI, uwsgi or SCGI protocols; NGINX can also serve static content directly from storage for additional performance optimization.
  • The only software tested that was capable of reliably handling over 10,000 requests per second of live traffic to WordPress applications from a single server.
  • NGINX’s memory and CPU footprints are minimal and predictable. After switching to NGINX, CPU usage on the load balancing servers dropped by a factor of three.

Overall, WordPress.com serves about 70,000 req/sec and over 15 Gbit/sec of traffic from its NGINX-powered load balancers at peak, with plenty of room to grow. The hardware configuration is dual Xeon 5620 4-core CPUs with hyper-threading and 8-12 GB of RAM, running Debian Linux 6.0. For high availability, WordPress.com previously used Wackamole/Spread but has recently started migrating to Keepalived. Inbound requests are distributed evenly across the NGINX-based web acceleration and load balancing layer using round-robin DNS.
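
As a rough sanity check on these figures, the Python sketch below does the back-of-the-envelope arithmetic. It treats the quoted peaks as exact and simultaneous, and it uses the 10,000 req/sec per-server figure as a round planning number. All three are simplifying assumptions: the post says "about", "over", and does not state how many load balancers are actually deployed.

```python
import math

# Figures quoted in the post, treated here as exact for illustration.
PEAK_REQ_PER_SEC = 70_000
PEAK_BITS_PER_SEC = 15e9          # 15 Gbit/sec
PER_SERVER_REQ_PER_SEC = 10_000   # single-server rate quoted for NGINX


def servers_at_rate(peak_req_per_sec, per_server_req_per_sec):
    """Servers needed to carry the peak at the given per-server rate."""
    return math.ceil(peak_req_per_sec / per_server_req_per_sec)


def avg_bytes_per_request(bits_per_sec, req_per_sec):
    """Average bytes moved per request, assuming both peaks coincide."""
    return bits_per_sec / 8 / req_per_sec


if __name__ == "__main__":
    print(servers_at_rate(PEAK_REQ_PER_SEC, PER_SERVER_REQ_PER_SEC))
    print(round(avg_bytes_per_request(PEAK_BITS_PER_SEC, PEAK_REQ_PER_SEC)))
```

Under those assumptions, the peak corresponds to about seven servers' worth of request capacity and an average of roughly 27 KB transferred per request, before accounting for redundancy across data centers.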

Reader Comments (16)

NGINX has been doing well for a few years now. With this many requests per second on such minimal hardware, I am impressed.

I hope to someday see some kind of lightweight server for .NET with this kind of result.

September 26, 2012 | Unregistered Commenter Patrick Desjardins

How does their database system work?

September 26, 2012 | Unregistered Commenter Seun Osewa

@Seun. I can't imagine that the bulk of the platform needs to serve dynamic data requiring a database hit. Once a blog post is live, it's essentially static. Serving static files/data is very easy and not costly at all. What is hard with this model is increasing density per server, yet that only lowers cost at a very large scale and is only worth the effort once a site reaches that size, which WordPress has. In summary, if serving static data is their problem, scaling the database is easy and doesn't require anything special. Master/slave replication or moving tables to dedicated servers is basic stuff.

The more dynamic the app, the harder it is to scale the data layer.

September 26, 2012 | Unregistered Commenter Dathan Vance Pattishall

I'd be more interested in the database sharding.

September 26, 2012 | Unregistered Commenter Darian Shimy

Do you really need 2000 servers for this amount of mostly static traffic? Or do they also act as a self-hosted CDN, backup and redundancy included?

September 26, 2012 | Unregistered Commenter Ed

Take a look at Barry's blog. There is quite a bit of information that he (and others) have shared about the WordPress.com environment.

WP.com seems to use HyperDB replication across 550 MySQL servers (as of July 2011).

September 26, 2012 | Unregistered Commenter Andrew

Hi Darian,

What specifically would you like to know about database sharding? Happy to do a post about that.

September 26, 2012 | Unregistered Commenter Barry

Yep, we need all of the servers. The traffic is not mostly static. Keep in mind that 85% of images served from WordPress.com are transformed dynamically - on the fly. Believe me, if we didn't need that many servers I would happily get rid of them ;)

September 26, 2012 | Unregistered Commenter Barry

Wow! That's really good. I think the Apache Foundation needs to improve Apache now.

September 26, 2012 | Unregistered Commenter Chankey Pathak

WordPress.com uses WordPress multisite.
WordPress multisite handles static files via PHP, which can reduce your mileage with Nginx.

So to get WordPress.com-like performance, either put Varnish in front of nginx to cache static content or use the nginx map directive.

September 26, 2012 | Unregistered Commenter Rahul Bansal

Also, here is one of my favorite videos about scalability: http://2011.sf.wordcamp.org/session/ask-barry/.
Thanks Barry!

September 27, 2012 | Unregistered Commenter Mustafa

What do they use to run the actual WordPress application? I've experimented with PHP-FPM behind nginx before with some success, but I'm curious to know if they've got a full Apache stack behind nginx.

September 27, 2012 | Unregistered Commenter Luke L

We are using Nginx + PHP-FPM on our web/application servers.

September 27, 2012 | Unregistered Commenter Barry

Why did you pass up HAProxy in favor of nginx for LB?
What do you use for GSLB?

September 27, 2012 | Registered Commenter mxx

Just a question: is WordPress using nginx as a load balancer with Apache behind it, or is it nginx all the way through?

Thanks

September 28, 2012 | Unregistered Commenter Lord2y

@Lord2y: please read the answer from Barry, just two posts above yours :)

@Rahul Bansal: why use Varnish in front of nginx? You can cache with nginx...

@Barry: "Keep in mind that 85% of images served from WordPress.com are transformed dynamically - on the fly." I don't understand that. What is transformed dynamically? Anyway, if you need some processing done on images (like scaling or compressing), it's probably done the first time the image is requested and then the result is cached, or am I missing something?

January 27, 2013 | Unregistered Commenter Olivier
