Wednesday, September 19, 2012

The 4 Building Blocks of Architecting Systems for Scale

If you are looking for an excellent overview of general architecture principles, take a look at Will Larson's Introduction to Architecting Systems for Scale. Based on his experiences at Yahoo! and Digg, Will covers the key concepts in some depth. A quick gloss on the building blocks:

  1. Load Balancing: Scalability & Redundancy. Horizontal scalability and redundancy are usually achieved via load balancing, the spreading of requests across multiple resources.
    • Smart Clients. The client keeps the list of hosts and balances requests across it (see the first sketch after this list). Upside is it's simple for programmers. Downside is the host list is hard to update and change.
    • Hardware Load Balancers. Targeted at larger companies, this is dedicated load balancing hardware. Upside is performance. Downside is cost and complexity.
    • Software Load Balancers. The recommended approach, this is software that handles load balancing, health checks, etc.
  2. Caching. Make better use of resources you already have. Precalculate results for later use. 
    1. Application Versus Database Caching. Database caching is simple because the programmer doesn't have to do anything extra. Application caching requires explicit integration into the application code (see the cache-aside sketch after this list).
    2. In-Memory Caches. These perform best, but you usually have much more disk than RAM.
    3. Content Distribution Networks. Move the burden of serving static resources off your application and onto a specialized distributed caching service.
    4. Cache Invalidation. Caching is great, but the catch is that you have to practice safe cache invalidation.
  3. Off-Line Processing. Processing that doesn't happen in-line with a web request. Reduces latency and/or handles batch processing.
    1. Message Queues. Work is queued to a cluster of agents to be processed in parallel (see the worker sketch after this list).
    2. Scheduling Periodic Tasks. Triggers daily, hourly, or other regular system tasks.
    3. Map-Reduce. When your system becomes too large for ad hoc queries, move to a specialized data processing infrastructure.
  4. Platform Layer. Disconnect application code from web servers, load balancers, and databases using a service-level API (see the last sketch after this list). This makes it easier to add new resources, reuse infrastructure between projects, and scale a growing organization.
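
To make the smart-client idea concrete, here is a minimal Python sketch of a client that holds its own host list and round-robins requests across it. The host names, retry behavior, and use of urllib are illustrative assumptions, not code from Will's article.

```python
import itertools
import urllib.request

class SmartClient:
    """Minimal smart-client load balancer: the client itself holds the
    host list and spreads requests across it (round-robin here)."""

    def __init__(self, hosts):
        # Hard-coded host list -- this is exactly the "hard to update" downside.
        self._hosts = list(hosts)
        self._cycle = itertools.cycle(self._hosts)

    def get(self, path, timeout=2):
        # Try each host at most once; skip hosts whose request fails.
        for _ in range(len(self._hosts)):
            host = next(self._cycle)
            try:
                with urllib.request.urlopen(f"http://{host}{path}", timeout=timeout) as resp:
                    return resp.read()
            except OSError:
                continue  # treat the host as unhealthy and move on
        raise RuntimeError("no healthy hosts available")

# Usage (hypothetical hosts):
# client = SmartClient(["10.0.0.1:8080", "10.0.0.2:8080"])
# client.get("/status")
```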
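
For application caching and invalidation, a common pattern is cache-aside: read through the cache and invalidate on every write. The sketch below uses a plain in-process dict with a TTL as a stand-in for a real cache such as memcached or Redis; the profile functions and key scheme are hypothetical.

```python
import time

# In-process stand-in for a real cache server; TTL and key scheme are assumptions.
_cache = {}
TTL_SECONDS = 300

def _db_load_profile(user_id):
    # Placeholder for a real database query.
    return {"id": user_id, "name": f"user-{user_id}"}

def get_profile(user_id):
    """Cache-aside read: check the cache, fall back to the database, then populate."""
    key = f"profile:{user_id}"
    entry = _cache.get(key)
    if entry and entry[0] > time.time():
        return entry[1]                      # cache hit, still fresh
    profile = _db_load_profile(user_id)      # cache miss: hit the database
    _cache[key] = (time.time() + TTL_SECONDS, profile)
    return profile

def update_profile(user_id, fields):
    """Write path: update the database, then invalidate the stale cache entry."""
    # ... write `fields` to the database here ...
    _cache.pop(f"profile:{user_id}", None)   # invalidate on every write
```

Invalidating on every write is what keeps readers from serving data that is staler than the TTL allows.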
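
Off-line processing through a message queue can be sketched with Python's standard-library queue and threads standing in for a real broker such as RabbitMQ or SQS; the email task is a made-up example.

```python
import queue
import threading

# In-process stand-in for a real message broker (RabbitMQ, SQS, etc.).
work_queue = queue.Queue()

def enqueue_email(address, body):
    # The web request only enqueues the work and returns immediately,
    # keeping the slow send out of the request/response cycle.
    work_queue.put(("send_email", address, body))

def worker():
    # One of a pool of agents draining the queue in parallel.
    while True:
        task, address, body = work_queue.get()
        print(f"{task}: sending to {address}")   # stand-in for the real work
        work_queue.task_done()

# Start a small pool of worker agents.
for _ in range(4):
    threading.Thread(target=worker, daemon=True).start()

enqueue_email("user@example.com", "Welcome!")
work_queue.join()    # wait for outstanding work before shutting down
```

Because the request only enqueues work, user-facing latency stays low while the worker pool absorbs the load in the background.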
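
Finally, one way to read the platform-layer advice: product code calls a narrow service-level API and never touches the storage tier directly. The endpoint and route below are assumptions for illustration only.

```python
import json
import urllib.request

# Hypothetical platform-layer client: application code talks to this
# service-level API instead of reaching into caches or databases directly.
PLATFORM_URL = "http://platform.internal"   # assumed internal endpoint

def get_user(user_id):
    """Fetch a user through the platform API; callers never see storage details."""
    with urllib.request.urlopen(f"{PLATFORM_URL}/users/{user_id}", timeout=2) as resp:
        return json.load(resp)

# Because callers depend only on this interface, the platform team can add
# caches, shard databases, or swap storage engines without touching product code.
```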

Reader Comments (2)

That is astonishingly similar (besides point 1) to the series from Lecloud (http://www.lecloud.net/search/scalability). I use their series with my students to teach the core principles of building a scalable web architecture.

September 19, 2012 | Unregistered Commenter Peter Meyers

Great article to get started!!

September 26, 2017 | Unregistered Commenter Komal Krishna
