10 eBay Secrets for Planet Wide Scaling
Tuesday, November 17, 2009 at 11:27AM
You don't even have to make a bid: Randy Shoup, an eBay Distinguished Architect, gives this presentation on how eBay scales, for free. In this presentation, and in the other talks listed at the end of this post, Randy has done a fabulous job of getting at the heart of the principles behind scalability. It's more about ideas of how things work and fit together than about a particular technology stack.
Impressive Stats
In case you weren't sure, eBay is big, with lots of: users, data, features, and change...
- Over 89 million active users worldwide
- 190 million items for sale in 50,000 categories
- Over 8 billion URL requests per day
- Hundreds of new features per quarter
- Roughly 10% of items are listed or ended every day
- In 39 countries and 10 languages
- 24x7x365
- 70 billion read/write operations per day
- Processes 50TB of new, incremental data per day
- Analyzes 50PB of data per day
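For a rough sense of scale, the daily totals above can be turned into average per-second rates. This is a back-of-the-envelope sketch, not figures from the presentation: it assumes load is spread evenly across the day (real peaks will be well above the average) and decimal terabytes (1 TB = 10^12 bytes). The variable names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Convert a daily total into an average per-second rate."""
    return per_day / SECONDS_PER_DAY


# Daily figures taken from the list above.
url_requests_per_sec = per_second(8e9)     # 8 billion URL requests/day
read_write_ops_per_sec = per_second(70e9)  # 70 billion read/write ops/day
new_bytes_per_sec = per_second(50e12)      # 50 TB/day, assuming decimal TB

# "Roughly 10%" of 190 million items listed or ended each day.
items_churned_per_day = 190_000_000 // 10

print(f"URL requests:   ~{url_requests_per_sec:,.0f} per second")
print(f"Read/write ops: ~{read_write_ops_per_sec:,.0f} per second")
print(f"New data:       ~{new_bytes_per_sec / 1e6:,.1f} MB per second")
print(f"Item churn:     ~{items_churned_per_day:,} items per day")
```

On average that works out to roughly 93 thousand URL requests and 810 thousand read/write operations every second, with about 19 million items listed or ended each day.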
10 Lessons
The presentation does a good job explaining each lesson, but the list is...
- Partition Everything - if you can't split it, you can't scale it. Split everything into manageable chunks by function and data.
- Asynchrony Everywhere - connect independent components through event queues
- Automate Everything - components should automatically adjust and the system should learn and improve itself.
- Remember Everything Fails - monitor everything, provide service even when parts start failing.
- Embrace Inconsistency - for each feature, pick where it needs to be on the CAP continuum; no distributed transactions; minimize inconsistency through careful operation ordering; become eventually consistent through async recovery and reconciliation.
- Expect (R)evolution - change is constant, design for extensibility, incrementally deploy changes.
- Dependencies Matter - minimize and control dependencies; use abstract interfaces and virtualization; components have an SLA; consumers are responsible for recovering from SLA violations.
- Be Authoritative - Know which data is authoritative, which data isn't, and treat it accordingly.
- Never Enough Data - data drives finding optimization opportunities, predictions, recommendations, so save it all.
- Custom Infrastructure - maximize the utilization of every resource.
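To make the first two lessons a little more concrete, here is a minimal sketch of partitioning data by key and connecting components through an event queue. It is an illustration only, not eBay's implementation: the shard count, the `shard_for`, `list_item`, and `indexer` functions, the `ITEM_LISTED` event name, and the in-memory dicts standing in for databases are all hypothetical.

```python
import hashlib
import queue
import threading

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard with a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# One in-memory dict per shard stands in for separate databases.
shards = [dict() for _ in range(NUM_SHARDS)]

# The event queue decouples the write path from downstream work.
events = queue.Queue()

# Hypothetical downstream component that is updated asynchronously.
search_index = set()


def list_item(item_id: str, title: str) -> None:
    """Producer: write to the owning shard, publish an event, return."""
    shards[shard_for(item_id)][item_id] = title
    events.put(("ITEM_LISTED", item_id))


def indexer() -> None:
    """Consumer: apply events to the search index until told to stop."""
    while True:
        event = events.get()
        try:
            if event is None:  # sentinel: shut down
                return
            kind, item_id = event
            if kind == "ITEM_LISTED":
                search_index.add(item_id)
        finally:
            events.task_done()


worker = threading.Thread(target=indexer, daemon=True)
worker.start()

for i in range(10):
    list_item(f"item-{i}", f"Title {i}")

events.put(None)  # ask the consumer to stop
events.join()     # wait until every queued event has been processed
worker.join()

print("items per shard:", [len(s) for s in shards])
print("indexed items:", len(search_index))
```

The write path only touches one shard and the queue, so the indexer can lag or restart without blocking new listings; the index simply catches up later, which is the kind of eventual consistency the "Embrace Inconsistency" lesson describes.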
Related Articles
- eBay Related Posts on HighScalability
- Scalability Best Practices: Lessons from eBay by Randy Shoup
- Episode 109: eBay's Architecture Principles with Randy Shoup, transcript
Reader Comments (5)
I love designing large systems but can't even imagine 50PB of data analysis. Wow!
Ironically I came across this article on the day eBay experiences a massive backend failure relating to their search engine.
Any particular reason, apart from joins etc., for using the MySQL in-memory engine instead of memcache for the personalization and session cache?
@Raj, durability would be my guess why not the memcache(d) you talk about.
What does "it's ..consumer’s responsibility to manage unavailability and SLA violations" mean? Shouldn't the service provider do everything possible to satisfy the availability guarantee in the SLA? I think that currently SPs are not doing enough in terms of managing availability, perhaps because it's too difficult or costly for them. It's a lot easier for them to refund your money (or worse, ask you to restart your work) without paying a hefty penalty.