Sunday, October 14, 2007

Product: The Spread Toolkit

Complex applications that coordinate work across many machines often need a high-performance, fault-tolerant messaging layer. Writing one is a blast, but it's probably a better use of your time to pick an off-the-shelf solution, and that's where Spread comes in. Flickr, for example, uses Spread to create real-time event feeds from their web server logs. What exactly is Spread?


From the Spread website:


Spread is an open source toolkit that provides a high performance messaging service that is resilient to faults across local and wide area networks. Spread functions as a unified message bus for distributed applications, and provides highly tuned application-level multicast, group communication, and point to point support. Spread services range from reliable messaging to fully ordered messages with delivery guarantees.

Spread can be used in many distributed applications that require high reliability, high performance, and robust communication among various subsets of members. The toolkit is designed to encapsulate the challenging aspects of asynchronous networks and enable the construction of reliable and scalable distributed applications.

Some of the services and benefits provided by Spread:
  • Reliable and scalable messaging and group communication.
  • A very powerful but simple API simplifies the construction of distributed architectures.
  • Easy to use, deploy and maintain.
  • Highly scalable from one local area network to complex wide area networks.
  • Supports thousands of groups with different sets of members.
  • Enables message reliability in the presence of machine failures, process crashes and recoveries, and network partitions and merges.
  • Provides a range of reliability, ordering and stability guarantees for messages.
  • Emphasis on robustness and high performance.
  • Completely distributed algorithms with no central point of failure.
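
To make "groups" and "fully ordered messages" a little more concrete, here is a minimal in-process sketch in Python. It is not the Spread API and it does not reflect Spread's actual protocol: the `ToyBus` class, its method names, and the single global sequence counter are assumptions made purely for illustration. It shows only the observable property, namely that every member of a group sees that group's messages in the same order.

```python
from collections import defaultdict
from itertools import count


class ToyBus:
    """Hypothetical in-memory stand-in for a group message bus (not Spread)."""

    def __init__(self):
        self._seq = count(1)               # one global counter yields a total order
        self._members = defaultdict(list)  # group name -> list of member inboxes

    def join(self, group):
        """Join a group and return the inbox that will receive its messages."""
        inbox = []
        self._members[group].append(inbox)
        return inbox

    def multicast(self, group, payload):
        """Stamp the message once, then deliver it to every current member."""
        stamped = (next(self._seq), payload)
        for inbox in self._members[group]:
            inbox.append(stamped)
        return stamped[0]


bus = ToyBus()
a = bus.join("photos")
b = bus.join("photos")
c = bus.join("discussions")

bus.multicast("photos", "upload:1")
bus.multicast("discussions", "thread:7")
bus.multicast("photos", "upload:2")

# Both members of "photos" hold identical, identically ordered inboxes.
print(a == b, [payload for _, payload in a], [payload for _, payload in c])
```

A real deployment has no shared counter in one process; the point of a toolkit like Spread is to provide comparable guarantees across machines, failures, and partitions.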


In Building Scalable Web Sites, Cal Henderson describes how Flickr uses Spread to create a log of real-time events, like photos uploaded and discussions started, as they happen. Spread is connected to their web servers. As photos are uploaded, these web server events are messaged in real time to agents consuming the feed.

The advantage of this architecture is that it sheds load from the database. Otherwise, each agent would have to poll the database continuously for new events.
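
The load argument can be sketched with a toy simulation in Python. Nothing here is Flickr's code or the Spread API: `FakeDatabase`, `EventFeed`, the event names, and the numbers of agents and ticks are all made up for illustration. The sketch counts how many database queries agents issue when they poll, and shows that a push feed delivers the same events with no agent queries at all.

```python
class FakeDatabase:
    """Hypothetical database that only counts the queries it receives."""

    def __init__(self):
        self.events = []
        self.queries = 0

    def events_since(self, offset):
        self.queries += 1
        return self.events[offset:]


class EventFeed:
    """Hypothetical push feed: the publisher hands each event to every subscriber."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        for callback in self.subscribers:
            callback(event)


AGENTS, TICKS = 3, 10
uploads = {2: "photo:41", 7: "photo:42"}  # tick -> event produced by the web tier

# Polling: every agent queries the database on every tick, event or not.
db = FakeDatabase()
seen_by_poll = [[] for _ in range(AGENTS)]
for tick in range(TICKS):
    if tick in uploads:
        db.events.append(uploads[tick])
    for seen in seen_by_poll:
        seen.extend(db.events_since(len(seen)))

# Push: the web tier publishes each event once; agents never touch the database.
feed = EventFeed()
seen_by_push = [[] for _ in range(AGENTS)]
for seen in seen_by_push:
    feed.subscribe(seen.append)
for tick in range(TICKS):
    if tick in uploads:
        feed.publish(uploads[tick])

print("poll queries:", db.queries)  # one query per agent per tick
print("same events delivered:", seen_by_poll == seen_by_push)
```

In the polling loop the query count grows with agents times ticks even when nothing happens, while the push path does work only when an event actually occurs.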

Related Articles

  • LAMP and the Spread Toolkit
  • The Spread Toolkit: Architecture and Performance

Reader Comments (11)

    I can definitely see where this would be very useful in many logging scenarios. It's common to need near real-time stats for critical web applications, especially large ones, and traditional logging methods such as log files and databases tend to have issues.

    Todd, where else are you seeing this deployed? For what types of problems?

    --
    Dustin Puryear
    Author, "Best Practices for Managing Linux and UNIX Servers"
    http://www.puryear-it.com/pubs/linux-unix-best-practices

    December 31, 1999 | Unregistered Commenter Anonymous

    Does anybody know of or use a good PHP library for Spread?

    December 31, 1999 | Unregistered Commenter dH

    The PECL spread package (http://pecl.php.net/package/spread) is quite good for PHP.

    Also, Spread is used in many places. For example, we use it at Technorati.

    December 31, 1999 | Unregistered Commenter Ryan King

    Hi,

    I've tried to use the PECL Spread package before, without any success.
    First of all, if I install it with
    pecl install channel://pecl.php.net/spread-1.1
    it appears under PEAR instead of PECL
    ("pear list" shows the spread package; "pecl list" does not).
    If I go to the PHP extensions directory, instead of a ".so" file there are just the unpacked sources of php_spread.
    I think the PECL version of the php_spread library just doesn't compile or work properly.
    I tried patching it in many ways to make it compile and run ( see the bug: http://pecl.php.net/bugs/bug.php?id=9349&edit=1 ), but after a while I gave up and switched to ActiveMQ.
    Please let me know how you managed to use Spread with PHP, since it's still an interesting message bus.
    Thank you!

    December 31, 1999 | Unregistered Commenter dH

    Great post! I totally agree that organizations with large tech ops often overlook messaging as a *phenomenally* flexible and reliable solution, and keep hitting the database (or worse). It's probably because most architects come from development, not from infrastructure engineering :) Great to know that the benefits of messaging were not lost on the likes of Flickr and Technorati...

    Specifically regarding Spread, however, I would like to point out that there has been no release for almost a year. In Python, I remember I had to apply a patch to its client lib in order to make it work with version 4. The patch came from the mailing list; it was not included in the main distro (not a big deal, but still...)

    As an alternative to Spread, you might want to check out the AMQP standard (www.amqp.org). It's an open standard, with many solid names among its backers and contributors, and it's being actively developed and improved. There are several broker implementations, which are interchangeable in many cases (not all, however... yet). Specifically, RabbitMQ (www.rabbitmq.com) is an AMQP-compliant broker. It is written in Erlang and, thanks to that, comes with good clustering support out of the box. Client libs for many languages are available (including Python and Ruby).

    December 31, 1999 | Unregistered Commenter Dmitriy

    Does Flickr advertise the fact that they use Spread? I thought the license required you to make it obvious that you do. From http://spread.org/license/license.html:

    3. All advertising materials (including web pages) mentioning features or use of this software, or software that uses this software, must display the following acknowledgment: "This product uses software developed by Spread Concepts LLC for use in the Spread toolkit. For more information about Spread see http://www.spread.org"

    A quick scan of flickr.com didn't show anything.

    December 31, 1999 | Unregistered Commenter Gary Richardson

    One of my friends is a network administrator at a large Web 2.0 company that uses Spread internally, and he told me a number of horror stories about how its use of broadcast causes switch meltdowns. At the very least, anyone considering using it for a large-scale production environment should verify that their switching infrastructure can deal with high broadcast or multicast rates.

    December 31, 1999 | Unregistered Commenter Fazal Majid

    > he told me a number of horror stories about how its use of broadcast causes switch meltdowns

    Did they have VLANs configured?

    December 31, 1999 | Unregistered Commenter Todd Hoff

    When I was at Hopkins, we worked out a separate license that doesn't require advertising if Spread is deployed and used only by mod_log_spread and spreadlogd.

    http://www.spread.org/license/apache_log_license.txt

    Many thanks to Yair for this contribution.

    December 31, 1999 | Unregistered Commenter Theo

    As I read it, apache_log_license.txt is *more* expansive than the Open Source Spread license in its requirements for acknowledging use of Spread. Notice it includes 'systems', not just software. Also, I can't be sure whether "software that uses this software" includes websites that use Spread behind the scenes. It's for this reason that I'm considering alternative products.

    December 31, 1999 | Unregistered Commenter Anonymous

