Tuesday, October 13, 2009

Why are Facebook, Digg, and Twitter so hard to scale?

Real-time social graphs (connectivity between people, places, and things). That's why scaling Facebook is hard, says Jeff Rothschild, Vice President of Technology at Facebook. Social networking sites like Facebook, Digg, and Twitter are simply harder to scale than traditional websites. Why would that be? Let's find out.

Traditional websites are easier to scale than social networking sites for two reasons:

  • They usually access only their own data and common cached data.
  • Only 1-2% of users are active on the site at one time.

Imagine a huge site like Yahoo. When you come to Yahoo they can fetch your profile record with a single get, and that's enough to build your view of the website for you. It's relatively straightforward to scale systems based around single records using distributed hashing schemes. And since only a few percent of the people are on the site at once, it takes comparatively little RAM cache to handle all the active users.
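To make that concrete, here's a minimal sketch (in Python, with made-up server names) of the kind of distributed hashing scheme that makes single-record lookups easy to scale: hash the user's key onto a ring of servers and you know exactly which box holds the profile.

import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, replicas=100):
        self._ring = []                      # sorted list of (hash, node)
        for node in nodes:
            for i in range(replicas):        # virtual nodes smooth the distribution
                h = self._hash(f"{node}:{i}")
                self._ring.append((h, node))
        self._ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise around the ring to the first node at or past the key's hash.
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["cache-01", "cache-02", "cache-03"])
# One lookup is enough to render the page for this user.
profile_server = ring.node_for("user:8675309")

The nice property is that adding or removing a server only remaps the keys near it on the ring, which is what makes this kind of system so friendly to growth.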

Now think what happens on Facebook. Let's say you have 200 friends. When you hit your Facebook account it has to go gather the status of all 200 of your friends at the same time so you can see what's new for them. That means 200 requests need to go out simultaneously, the replies need to be merged together, other services need to be contacted to get more details, and all this needs to be munged together and sent through PHP and a web server so you see your Facebook page in a reasonable amount of time. Oh my.
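For illustration, here's a rough sketch of that fan-out read in Python. fetch_status() and the friend list are hypothetical stand-ins for the real services; the point is simply that one page view turns into a couple hundred parallel requests whose replies then have to be merged.

from concurrent.futures import ThreadPoolExecutor

def fetch_status(friend_id):
    # In a real system this would hit memcached or a status service.
    return {"friend": friend_id, "status": f"latest status of {friend_id}"}

def build_news_feed(friend_ids):
    # Fire all the requests in parallel and collect the replies.
    with ThreadPoolExecutor(max_workers=50) as pool:
        replies = list(pool.map(fetch_status, friend_ids))
    # Merge; a real feed would also rank, filter, and fetch extra details.
    return sorted(replies, key=lambda r: r["friend"])

feed = build_news_feed([f"user:{i}" for i in range(200)])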

There are several implications here, especially given that on social networking sites a high percentage of users are on the system at one time (that's the social part, people hang around):

  • All data is active all the time. 
  • It's hard to partition this sort of system because everyone is connected. 
  • Everything must be kept in RAM cache so that the data can be accessed as fast as possible.

Partitioning means you would like to find some way to cluster commonly accessed data together so it can be accessed more efficiently. Facebook, because of the interconnectedness of the data, didn't find any clustering scheme that worked in practice. So instead of partitioning and denormalizing data, Facebook keeps data normalized and randomly distributes it amongst thousands of databases.
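As a toy illustration of that idea, here's what "randomly distribute amongst thousands of databases" might look like: assign each new object to a random shard and keep a directory of where everything lives. The shard names and the in-memory directory are made up; Facebook's real machinery is far more involved.

import random

SHARDS = [f"db-{i:04d}" for i in range(4096)]
directory = {}                       # object id -> shard name

def assign_shard(object_id):
    shard = random.choice(SHARDS)    # no clustering attempted, by design
    directory[object_id] = shard
    return shard

def shard_for(object_id):
    return directory[object_id]

assign_shard("status:12345")
print(shard_for("status:12345"))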

This approach requires a very fast cache. Facebook uses memcached as their caching layer. All data is kept in cache and they've made a lot of modifications to memcached to speed it up and to help it handle more requests (all contributed back to the community).

Their caching tier services 120 million queries every second and it's the core of the site. The problem is memcached is hard to use because it requires programmer cooperation. It's also easy to corrupt. They've developed a complicated system to keep data in the caching tier consistent with the database, even across multiple distributed data centers. Remember, they are caching user data here, not HTML pages or page fragments. Given how much their data changes, it would be hard to make page caching work.
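The basic read/write pattern behind such a caching tier is the classic look-aside cache. The sketch below uses a plain dict in place of memcached and hypothetical load_user/store_user functions in place of the database, but it shows the programmer cooperation mentioned above: every code path that writes the database must also remember to invalidate the cache.

cache = {}

def load_user(user_id):            # hypothetical database read
    return {"id": user_id, "name": "Alice"}

def store_user(user_id, record):   # hypothetical database write
    pass

def get_user(user_id):
    key = f"user:{user_id}"
    record = cache.get(key)
    if record is None:             # cache miss: fall back to the database
        record = load_user(user_id)
        cache[key] = record        # populate the cache for the next reader
    return record

def update_user(user_id, record):
    store_user(user_id, record)          # write the database first...
    cache.pop(f"user:{user_id}", None)   # ...then invalidate, rather than update, the cache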

We see similar problems at Digg. Digg, for example, must deal with the problem of sending out updates to 40,000 followers every time Kevin Rose diggs a link. Digg, and I think Twitter too, have taken a different approach than Facebook.

Facebook takes a Pull on Demand approach. To recreate a page or a display fragment they run the complete query. To find out if one of your friends has added a new favorite band, Facebook actually queries all your friends to find what's new. They can get away with this only because of their awesome infrastructure.

But if you've ever wondered why Facebook has a 5,000-friend limit, this is why. At a certain point it's hard to make Pull on Demand scale.

Another approach to finding out what's new is the Push on Change model. In this model, when a user makes a change it is pushed out to all the relevant users, and the changes (in some form) are stored with each user. So when a user wants to view their updates, all they need to access is their own account data. There's no need to poll all their friends for changes.
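Here's a bare-bones sketch of Push on Change: when someone posts, a reference to the update is copied into every follower's own inbox, so reading your feed means reading only your own data. followers_of() and the in-memory inboxes are stand-ins; a real system would store the inboxes durably and do the fan-out asynchronously.

from collections import defaultdict

inboxes = defaultdict(list)          # user id -> list of update references

def followers_of(user_id):
    return [f"follower-{i}" for i in range(5)]   # hypothetical follower lookup

def publish(author_id, update_id):
    for follower in followers_of(author_id):
        inboxes[follower].append(update_id)      # push on change

def read_feed(user_id):
    return inboxes[user_id]          # no polling of friends needed

publish("kevinrose", "digg:42")
print(read_feed("follower-0"))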

With security and permissions it can be surprisingly complicated to figure out who should see an update. And if a user has 2 million followers it can be surprisingly slow as well. There's also an issue of duplication. A lot of duplicate data (or references) is being stored, so this is a denormalized approach, which can make for some consistency problems. Should permissions be consulted when data is produced or when it is consumed, for example? And what if the data is deleted after it has already been copied around?

While all these consistency and duplication problems are interesting, Push on Change seems the more scalable approach for really large numbers of followers. It does take a lot of work to push all the changes around, but that can be handled by a job queuing system so the work is distributed across a cluster.
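Here's one way that job-queue fan-out might look, using Python threads and an in-process queue as a stand-in for a real work queue (Gearman, beanstalkd, and the like) feeding a cluster of workers.

import queue
import threading

jobs = queue.Queue()

def worker():
    while True:
        follower, update_id = jobs.get()
        # deliver update_id to follower's inbox here
        jobs.task_done()

for _ in range(8):                                # a handful of workers stand in for a cluster
    threading.Thread(target=worker, daemon=True).start()

def enqueue_fanout(followers, update_id):
    for follower in followers:                    # 40,000 followers become 40,000 small jobs
        jobs.put((follower, update_id))

enqueue_fanout([f"follower-{i}" for i in range(40000)], "digg:42")
jobs.join()                                       # wait for all deliveries to finish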

The challenges will only grow as we get more and more people, more and deeper inter-connectivity, faster and faster change, and a greater desire to consume it all in real-time. We are a long way from being able to handle this brave new world.

Reader Comments (20)

Very nice overview of the whole scalability problem for social websites. However, I am sure the push model is the next thing, especially using protocols like XMPP outside of classic chat applications.

October 13, 2009 | Unregistered CommenterAbhinav Singh

Thanks for the great article.

I always wonder how Facebook gets away with the giant interconnected database.

October 13, 2009 | Unregistered Commenterrunning

"Their caching tier services 120 million queries every second and it's the core of the site."

Can you cite the reference for that? Is it in his presentation? If so, I will go watch it.

October 13, 2009 | Unregistered Commenterlr

The web industry can learn from the financial industry about this. Stock markets have been messaging stock prices with very low latency to huge numbers of nodes for years. There are various commercial products such as Tibco Rendezvous that provide such a message bus. Pushing out updates to 40,000 followers (hosted on a lot fewer partitions, probably) should not be a real challenge then.

October 13, 2009 | Unregistered CommenterRobin

Facebook's restrictions also mirror the natural limits on our own attention economies: no one can possibly know (or retain) everything about 5000 other folks.

So long as web architectures need only to scale around our feeble human brains, we are in fairly safe territory.

October 13, 2009 | Unregistered CommenterMichael E. Driscoll

I like this article, very informative. Distributed systems and caching are always my topic of interest :)

October 13, 2009 | Unregistered CommenterHabib Ullah Bahar

Good post.

October 13, 2009 | Unregistered CommenterOliver Nassar

Hmm..interesting points.
That touched upon some important points about scalability that affects such heavily visited sites.

October 14, 2009 | Unregistered CommenterVivek

Hi,

We have a recent paper that deals with scaling of OSNs, and we deal with
some issues you mention - including that of partitioning.

The paper can be found at: http://bit.ly/18Gqmx

Please feel free to contact the authors for more info.

Cheers,
vijay

October 14, 2009 | Unregistered CommenterVijay

Great topic and explanation of the problem. If you are interested in an excellent way to solve this problem, check out the attached link and webinar.
http://budurl.com/oct15wbnr

October 14, 2009 | Unregistered CommenterChris Mureen

Nice thought-provoking article. I think I have an idea what to do with this "gather the status of all 200 of your friends at the same time". Simply don't. People are not likely to want to or even be capable of looking at that size of data all at once, so supply it in chunks that are not overwhelming.

October 14, 2009 | Unregistered CommenterKalengi

Nice article but you're wrong about the 5,000 user limit. I remember watching a talk by a facebook engineer a while back (I think it was the Jeff Hammerbacher one on Yahoo Brickhouse) where he said that the limit was a "product decision", not an engineering one. He also said that fan pages use the exact same model as regular people and there's no limit there. So basically the 5,000 friend limit is completely arbitrary and they could remove it tomorrow without any problem.

October 16, 2009 | Unregistered CommenterGuillaume Theoret

Interesting and very insightful. I have been looking for answers to the scalability issue for quite some time and have even blogged about it, but this is the best explanation I have come across.

October 17, 2009 | Unregistered Commenterdanishctc

Great post. It's when we deal with scaling these large sites that software architecture really becomes interesting :)

October 17, 2009 | Unregistered CommenterBjartN

Firstly, the approach that Facebook and Digg have been using is the best they can do.
However, their effort to find a more optimized solution may trigger new inventions in data management. As you know, technology develops the most during wars, and this is the war for money in a highly competitive market :)

October 19, 2009 | Unregistered CommenterBilal

Yes, I buy this for Facebook, Flickr, Digg, etc. But not for Twitter. They have a relatively simple data structure and a very small database. They recently claimed that they had at peak 5,000 messages per second during the Michael Jackson event. Even if I were generous and said that they averaged 5,000 messages per second continuously, you would still only get 20 TB of messages being added a year. Sure they have a lot of requests, but this is not a lot of data to draw out. If they only bothered to implement some form of realtime push rather than making people poll, they might find considerably fewer API calls made as well. Twitter don't have as much excuse for all of their recent outages compared to, say, Facebook.

Patrick.

October 20, 2009 | Unregistered CommenterPatrick

Sounds like it is time for someone to build social networking straight into hardware. Perhaps a social networking ASIC?

October 21, 2009 | Unregistered CommenterRob

The social networking problem in the extreme eventually becomes closely related to the classic problem facing neural networks and other massively parallel computation systems: the factorial expansion of interconnections. Each additional node N increases the number of potential connections by N-1. In the brain, a typical neuron connects with about 10,000 receivers and 10,000 senders, so the multiplying factor is only on the order of 10,000, rather than billions. So the Facebook limit is in a similar order of magnitude. Interestingly, the packet-switched network has much better scalability than a hardwired switch, so an ASIC is probably not the way to go unless it implements a network-like bus layer. That is a design that I have been studying.

Biological neural networks additionally solve this in part by a ruthless pruning of interconnects that do not carry a significant signal, while neurons whose 'votes' are not used regularly (their outputs are not weighted highly by any receiving neurons), can actually die.

October 23, 2009 | Unregistered CommenterGary B

Sadly, users don't care and are intolerant of visible bugs caused by scalability issues (or the lack of scalability).

August 24, 2010 | Unregistered CommenterTaiTran

Very interesting article. Thanks!

March 12, 2011 | Unregistered CommenterAlejandro Hidalgo
