Friday, November 12, 2010
Stuff the Internet Says on Scalability For November 12th, 2010

- Google – A Study In Scalability And A Little Systems Horse Sense. A nice summary by Krishna Sankar of a version of Jeff Dean's classic talk on Google Scalability given to Stanford's EE380 class.
- Quotable Quotes:
  - @jkalucki: Getting just 100 servers to work together for the first time is so ridiculously complicated. Horizontal scaling doesn't scale.
  - @simeons: Yahoo's scalability is driven by lots of asynchronous processing. "You learn to love it." -- @rstata Yahoo's CTO
- The Economics of the Cloud: Dissecting a Must-Read White Paper by Bernard Golden. I love the depiction of the unseen and unfelt forces that nevertheless organize everything around them: After a brief introduction, the authors lay out a central thesis: despite initial concerns about shortcomings in new technology offerings, "historically, underlying economics have a much stronger impact on the direction and speed of disruptions, as technological challenges are resolved or overcome through the rapid innovation we've grown accustomed to."
- Cassandra: What is HintedHandoff? by Royans. The fun starts when a node, which could be the master for a range of keys, goes down.
- Hypertable Use Case: Tribalytic. One of the primary reasons we chose a scalable NoSQL database was so that the system would be architected to scale up as we grow. We can add more countries, increase sample sizes, and sustain more traffic without “hitting the wall”.
- ACID in Theory and Practice by Dan Weinreb. When you dig deep, relational databases aren't that much more ACID compliant than NoSQL databases. Everyone compromises for speed.
- Which is faster: MySQL or MongoDB? Does it depend on the use case? Good discussion of when to pick which type of system and why.
- Scalability as a Discipline by AKF Partners. While understanding the rules, patterns, and principles of scalability are completely achievable by anyone in the technology organization, this does not mean that they are widely known.
- Rackspace Cloud Servers versus Amazon EC2: Performance Analysis by Bitsource. All the usual caveats apply with benchmarks, but a very interesting discussion.
- Transparent query layer for MySQL by Robert Eisele.
- The video for the InfiniteGraph Live Event: A NOSQL Evening in Palo Alto, with Tim Anglade, InfiniteGraph and Scality, is up.
- Adrian Cockcroft's NoSQL Netflix Use Case Comparison for Riak. Riak takes the Netflix quiz.
Reader Comments (1)
From the first entry "Google – A Study In Scalability And A Little Systems Horse Sense",
I really like the chart:
http://doubleclix.files.wordpress.com/2010/11/numberseveryoneshouldknow.png?w=570&h=425
This is what I always have in mind when I am working on fast path code.
Random tangent:
The mutex lock/unlock measurement is deceptive. Repeatedly acquiring a mutex from the same thread without contention is very different from contending for the same mutex across multiple cores and NUMA nodes.
It also ignores the overhead of bouncing cache lines across cores and NUMA nodes over a shared bus with limited bandwidth. The mutex is not the only expensive part of sharing state between cores.