Monday, June 10, 2013

The 10 Deadly Sins Against Scalability

In the moral realm there may be 7 deadly sins, but scalability maven Sean Hull has come up with Five More Things Deadly to Scalability that, when added to his earlier 5 Things That are Toxic to Scalability, make for a numerologically satisfying 10 sins against scalability:

  1. Slow Disk I/O – RAID 5 – Multi-tenant EBS. Use RAID 10; it provides good protection along with good read and write performance. The design of RAID 5 means poor performance and long repair times on failure. On AWS, consider Provisioned IOPS as a way around I/O bottlenecks.
  2. Using the database for queuing. The database may seem like the perfect place to keep work queues, but under load, locking and scanning overhead kill performance. Use specialized products like RabbitMQ and SQS to remove this bottleneck.
  3. Using the database for full-text searching. Search seems like another perfect database feature, but at scale it doesn't perform well. Use specialized technologies like Solr or Sphinx.
  4. Insufficient caching at all layers. Use memcache between your application and the database. Use a page cache like Varnish between users and your webserver. Select proper caching options for your HTML assets.
  5. Too much technical debt. Rewrite problem code instead of continually paying an implementation tax for poorly written code. In the long run it pays off.
  6. Object Relational Mappers. They create complex queries that are hard to optimize and tweak.
  7. Synchronous, Serial, Coupled or Locking Processes. Locks are like stop signs; traffic circles keep the traffic flowing. Row-level locking is better than table-level locking. Use async replication. Use eventual consistency for clusters.
  8. One Copy of Your Database. A single database server is a choke point. Create parallel databases and let a driver select between them.
  9. Having No Metrics. Visualize what's happening to your system using one of the many monitoring packages.
  10. Lack of Feature Flags. Be able to turn off features via a flag, so that when a spike hits, non-essential features can be switched off to reduce load.
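
To make the last item concrete, here is a minimal sketch of feature flags used for load shedding. Everything in it is an assumption for illustration: the `FeatureFlags` class, the `shed_load` helper, the flag names, and the load threshold are hypothetical, and a real system would more likely keep flags in shared configuration than in process memory.

```python
class FeatureFlags:
    """Hypothetical in-memory flag store; real deployments usually share flags via config."""

    def __init__(self, flags):
        self._flags = dict(flags)

    def is_enabled(self, name):
        # Unknown flags default to off, so a typo never enables a feature.
        return self._flags.get(name, False)

    def disable(self, name):
        self._flags[name] = False

    def enable(self, name):
        self._flags[name] = True


def shed_load(flags, current_load, threshold, optional_features):
    """Switch off optional features when load exceeds the threshold.

    Returns the names of the features that were switched off by this call.
    """
    turned_off = []
    if current_load > threshold:
        for name in optional_features:
            if flags.is_enabled(name):
                flags.disable(name)
                turned_off.append(name)
    return turned_off


def render_page(flags):
    """Build a page from its core content plus whichever optional parts are enabled."""
    parts = ["core content"]
    if flags.is_enabled("recommendations"):
        parts.append("recommendations")
    if flags.is_enabled("live_comments"):
        parts.append("live comments")
    return parts


if __name__ == "__main__":
    flags = FeatureFlags({"recommendations": True, "live_comments": True})
    print(render_page(flags))
    # Simulated spike: load of 0.95 against an assumed threshold of 0.8.
    print(shed_load(flags, 0.95, 0.8, ["recommendations"]))
    print(render_page(flags))
```

The point of the design is that the expensive feature is removed with a flag flip rather than a deploy, and the core page keeps rendering throughout.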

Reader Comments (13)

Great check list. Thanks for sharing.

June 11, 2013 | Unregistered Commenter Steve Jones

Thank you for the post and sharing some great links and ideas.

I'm curious about the image, is there a link to the full sized image somewhere?

June 11, 2013 | Unregistered Commenter Bryce Verdier

did you mean locking at db level is the only sin, not in general sense?

June 12, 2013 | Unregistered Commenter abhishek manocha

Very efficient and nice article, thanks.
Regarding the 6th item, you don't recommend using very popular packages like entity framework or xhibernate at all?

June 12, 2013 | Unregistered Commenter mohammad

good article, can we get a full size image?

June 12, 2013 | Unregistered Commenter blaf

find it here! http://visual.ly/circles-hell-dantes-inferno

June 12, 2013 | Unregistered Commenter SushiX

Would service broker not scale as a mechanism for queues?

June 17, 2013 | Unregistered Commenter Elliot

Are these really the 10 deadly sins of scalability? I would think that having an unlimited growth (default setting) SQL database file would be one of the 10 deadly sins of scalability. In fact most DBAs probably don't archive data and remove it from their ever-growing database. The result ends up being longer and longer restore times, indexing, etc., to the point where it's very expensive to manage.

June 17, 2013 | Unregistered Commenter Codechimp

I agree, but the original sin is when you think scalability is a production issue rather than a design consideration. Most if not all of the "sins"/issues you mention will occur when the system's builders are not mindful of scalability, or planning to "leave it for later", or management is focusing on functional scope at the expense of a future-proof architecture.

July 8, 2013 | Unregistered Commenter Itamar Haber

Also, for a search solution one can use ElasticSearch (http://www.elasticsearch.org/). I haven't used it myself, but I have seen a few articles (Google!) of people migrating from Solr to ElasticSearch and getting better performance.

July 31, 2013 | Unregistered Commenter NS du Toit

Re: 1. Slow Disk I/O – RAID 5 – Multi-tenant EBS.
Using legacy RAID models like RAID 10, will lead to data loss (mirroring is poor protection and a waste of space); I suggest ZFS ZRAID5 with hot spares, it is fast, and repairs will have limited effect on throughput unlike legacy RAID 5 performance degrades.

July 31, 2013 | Unregistered Commenter Infernoz

Re: 5. Object Relational Mappers.
Agreed; I much prefer using lighter template mapping SQL frameworks like MyBatis than bloated ORM like Hibernate.
I still like some kind of framework, because using raw API SQL calls requires a lot of boiler-plate code, where the quality can be rather lacking e.g. no connection pools, poor resource clean up, poor error handling, and poor or even no logging!

July 31, 2013 | Unregistered Commenter Infernoz

Very well written and unique article. Thanks for sharing. :)

October 15, 2013 | Unregistered Commenter MBD Singh
