Monday, July 9, 2007

LiveJournal Architecture

A fascinating and detailed story of how LiveJournal evolved their system to scale. LiveJournal was an early player in the free blog service race and ran into problems from adding a large number of users quickly. Blog posts come fast and furious, which causes a lot of writes, and writes are particularly hard to scale. Understanding how LiveJournal faced their scaling problems will help any aspiring website builder.

Site: http://www.livejournal.com/

Information Sources

  • LiveJournal - Behind The Scenes Scaling Storytime
  • Google Video
  • Tokyo Video
  • 2005 version

    Platform

  • Linux
  • MySQL
  • Perl
  • Memcached
  • MogileFS
  • Apache

    What's Inside?

  • Scaling from 1, 2, and 4 hosts to a cluster of servers.
  • Avoid single points of failure.
  • Using MySQL replication only takes you so far.
  • Becoming IO bound kills scaling.
  • Spread out writes and reads for more parallelism.
  • You can't keep adding read slaves and scale.
  • Shard storage approach, using DRBD, for maximal throughput. Allocate shards based on roles.
  • Caching to improve performance with memcached. Two-level hashing to distributed RAM.
  • Perlbal for web load balancing.
  • MogileFS, a distributed file system, for parallelism.
  • TheSchwartz and Gearman for distributed job queuing to do more work in parallel.
  • Solving persistent connection problems.
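
To make the sharding idea in the list above more concrete, here is a minimal sketch in Python. It assumes a global directory that maps each user to a storage cluster, with all of that user's data living on that one cluster, so reads and writes for different users land on different machines. The names `ShardCluster`, `ShardDirectory`, `add_post`, and `read_posts` are hypothetical, and in-memory dicts stand in for real databases; this is an illustration of the general technique, not LiveJournal's code.

```python
from dataclasses import dataclass, field


@dataclass
class ShardCluster:
    """Stand-in for one user-data cluster (hypothetical; a real one is a database)."""
    cluster_id: int
    rows: dict = field(default_factory=dict)  # user_id -> list of posts


class ShardDirectory:
    """Global lookup: which cluster holds which user (an assumption of this sketch)."""

    def __init__(self, clusters):
        self.clusters = {c.cluster_id: c for c in clusters}
        self.user_to_cluster = {}

    def assign(self, user_id):
        """Place a new user on the cluster currently holding the fewest users."""
        if user_id in self.user_to_cluster:
            return self.user_to_cluster[user_id]
        target = min(self.clusters.values(), key=lambda c: len(c.rows))
        self.user_to_cluster[user_id] = target.cluster_id
        target.rows[user_id] = []
        return target.cluster_id

    def cluster_for(self, user_id):
        """Resolve the single cluster that owns this user's data."""
        return self.clusters[self.user_to_cluster[user_id]]


def add_post(directory, user_id, text):
    # The write touches only the owning cluster, not every machine.
    directory.cluster_for(user_id).rows[user_id].append(text)


def read_posts(directory, user_id):
    # The read is routed the same way, so load spreads across clusters.
    return list(directory.cluster_for(user_id).rows[user_id])
```

With two clusters, the first user assigned goes to one cluster and the second user to the other, and each user's posts are written to and read from only the cluster that owns them.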

    Lessons Learned

  • Don't be afraid to write your own software to solve your own problems. LiveJournal has provided incredible value to the community through their efforts.

  • Sites can evolve from small one- or two-machine setups to larger systems as they learn about their users and what their system really needs to do.

  • Parallelization is key to scaling. Remove choke points by caching, load balancing, sharding, clustering file systems, and making use of more disk spindles.

  • Replication has a cost. You can't just keep adding more and more read slaves and expect to scale.

  • Low-level issues like which OS event notification mechanism to use, file system and disk interactions, threading and event models, and connection types matter at scale.

  • Large sites eventually turn to a distributed queuing and scheduling mechanism to distribute large workloads across a grid.
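
The queuing lesson can be illustrated with a small sketch. The version below is an in-process stand-in written in Python: jobs go onto a shared queue and a pool of worker threads pulls from it, so slow work runs in parallel rather than inline with a web request. The function `run_jobs` and its parameters are hypothetical names for this illustration. It does not use or imitate the TheSchwartz or Gearman APIs, which distribute jobs across machines rather than threads.

```python
import queue
import threading


def run_jobs(jobs, handler, num_workers=4):
    """Run handler(job) for every job on a pool of worker threads.

    Returns the results in the same order as the input jobs.
    """
    jobs = list(jobs)
    pending = queue.Queue()
    for index, job in enumerate(jobs):
        pending.put((index, job))
    results = [None] * len(jobs)

    def worker():
        # Each worker keeps taking jobs until the queue is drained.
        while True:
            try:
                index, job = pending.get_nowait()
            except queue.Empty:
                return
            results[index] = handler(job)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

A call such as `run_jobs(post_ids, send_notifications, num_workers=8)` (both arguments hypothetical) would fan the work out across eight workers. A real distributed queue adds persistence, retries, and workers on other hosts, none of which this sketch attempts.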

    Reader Comments (9)

    oo nice. livejournal is best.

    December 31, 1999 | Unregistered Commenteryoutube

    If LiveJournal continues its reputation like this, it will also lead like the major journals such as Time and others.

    December 31, 1999 | Unregistered Commenterfarhaj

    i cant believe ive never heard of livejournal before
    i use wordpress but i think i will try this out thanks

    December 31, 1999 | Unregistered Commenterwinkbingo

    thanks, very informative post for me.

    December 31, 1999 | Unregistered CommenterHidden object games

    Livejournal.com appears to have less traffic than wordpress.com I wonder how the architecture compares between typepad.com, wordpress.com and livejournal.com since they all provide the same service and have many users.

    December 31, 1999 | Unregistered CommenterInternet Marketing Company

    thanks, very informative post for me.

    December 31, 1999 | Unregistered Commentererotik izle

    Where do you get DRBD from? LiveJournal does NOT use DRBD.

    December 31, 1999 | Unregistered CommenterAnonymous

    I really appreciate the information here presented and hope you can keep us well inform in future posts. Thanks.

    December 31, 1999 | Unregistered CommenterR6 Fairings

    Man you present us a great source of information. The list of links you present us is really usefull! I give you 9/10 for this post! Great work!

    December 31, 1999 | Unregistered CommenterArchie
