Tuesday, July 28, 2009

37signals Architecture

Update 7: Basecamp, now with more vroom. Basecamp application servers running Ruby code were upgraded and virtualization was removed. The result: a 66% reduction in response time while handling multiples of the traffic, which is beyond what I expected. They still use virtualization (Linux KVM), just less of it now.
Update 6: Things We've Learned at 37signals. Themes: less is more; don't worry, be happy.
Update 5: Nuts & Bolts: HAProxy. Nice explanation (post, screencast) by Mark Imbriaco of why HAProxy (a load balancing proxy server) is their favorite (fast, efficient, graceful configuration, queues requests when Mongrels are busy) for spreading dynamic content requests between Apache web servers and Mongrel application servers.
Update 4: O'Reilly's Tim O'Brien interviews David Heinemeier Hansson, Rails creator and 37signals partner. He says Basecamp scales horizontally on the application and web tiers and scales up for the database, using one "big ass" 128 GB machine. As he puts it: as technology moves on, hardware gets cheaper and cheaper; in my mind, you don't want to shard unless you positively have to, sort of a last-resort approach.
Update 3: The need for speed: Making Basecamp faster. Pages now load twice as fast, CPU usage was cut by a third, and database time by about half. Results were achieved through analysis, caching, MySQL optimizations, and hardware upgrades.
Update 2: Customer support is handled in real time using Campfire.
Update: Highly useful information on creating a customer billing system.


In the giving spirit of Christmas, the folks at 37signals have shared a bit about how their system works. 37signals is most famous for loosing Ruby on Rails into the world, and they've used RoR to build their very popular Basecamp, Highrise, Backpack, and Campfire products. RoR takes a lot of heat for being a performance dog, but 37signals seems to handle a lot of traffic with relatively normal-sounding resources. This is just an initial data dump; they promise to add more details later. As they add more, I'll update it here.

Site: http://www.37signals.com

Information Sources

  • Ask 37signals: Numbers?
  • Ask 37signals: How do you process credit cards?
  • Behind the scenes at 37signals: Support
  • Ask 37signals: Why did you restart Highrise?

    Platform

  • Ruby on Rails
  • Memcached
  • Xen
  • MySQL
  • S3 for image storage

    The Stats

  • 30 servers ranging from single processor file servers to 8 CPU application servers for about 100 CPUs and 200GB of RAM.
  • Plan to diagonally scale by reducing the number of servers to 16 for about 92 CPU cores (each significantly faster than those used today) and 230 GB of combined RAM.
  • Xen virtualization will be used to improve system management.
  • Basecamp (web based project management)
    * 2,000,000 people with accounts
    * 1,340,000 projects
    * 13,200,000 to-do items
    * 9,200,000 messages
    * 12,200,000 comments
    * 5,500,000 time tracking entries
    * 4,000,000 milestones

  • Backpack (personal and small business information management)
    * Just under 1,000,000 pages
    * 6,800,000 to-do items
    * 1,500,000 notes
    * 829,000 photos
    * 370,000 files

  • Overall storage stats (Nov 2007)
    * 5.9 terabytes of customer-uploaded files
    * 888 GB of files uploaded (900,000 requests)
    * 2 TB of files downloaded (8,500,000 requests)

    The Architecture

  • Memcached caching is used and they are looking to add more. Yields impressive performance results.
  • URL helper methods are used rather than building the URLs by hand.
  • Standard ActiveRecord-built queries are used, but for performance reasons they will also "dig in and use" find_by_sql when necessary (see the sketch after this list).
  • They fix Rails when they run into performance problems. It pays to be king :-)
  • Amazon’s S3 is used for storage of files uploaded by users. Extremely happy with the results.
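
To make the caching and query items above concrete, here is a minimal sketch, assuming a Rails 2-era application with memcached behind Rails.cache; the Project model, cache keys, and SQL are illustrative assumptions, not 37signals' actual code.

    # A minimal sketch, not 37signals' actual code. Assumes a Rails 2-era app with
    # memcached behind Rails.cache (config.cache_store = :mem_cache_store) and a
    # hypothetical Project model.
    class ProjectsController < ApplicationController
      def show
        # Read-through caching: serve from memcached when possible, fall back to
        # the database and repopulate the cache on a miss.
        @summary = Rails.cache.fetch("project/#{params[:id]}/summary",
                                     :expires_in => 10.minutes) do
          Project.find(params[:id]).summary
        end
      end
    end

    class Project < ActiveRecord::Base
      has_many :todo_items

      # The standard ActiveRecord finder covers the common case...
      def self.active
        find(:all, :conditions => { :archived => false })
      end

      # ...and find_by_sql is used when a hand-tuned query is worth the trouble.
      def self.with_open_todo_counts
        find_by_sql(<<-SQL)
          SELECT projects.*, COUNT(todo_items.id) AS open_todo_count
          FROM projects
          LEFT JOIN todo_items
            ON todo_items.project_id = projects.id
           AND todo_items.completed_at IS NULL
          GROUP BY projects.id
        SQL
      end
    end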

    Credit Card Processing Process

  • Bill monthly. It makes credit card companies more comfortable because they won't be on the hook for a large chunk of change if your company goes out of business. Customers also like it better because it costs less up front and you don't need a contract. Just pay as long as you want the service.

  • Get a Merchant Account. One is needed to process credit cards. They use Chase Bank. Use someone you trust and later negotiate rates when you get enough volume that it matters.
  • Authorize.net is the gateway they use to process the credit card charge.
  • A custom-built system handles the monthly billing. It runs each night, bills the appropriate people, and records the result (see the sketch after this list).
  • On success an invoice is sent via email.
  • On failure an explanation is sent to the customer.
  • If the card is declined three times the account is frozen until a valid card number is provided.
  • Error handling is critical because problems with charges are common. Freezing too fast is bad; freezing too slowly is also bad.
  • All products are being converted to using a centralized billing service.
  • You need to be PCI DSS (Payment Card Industry Data Security Standard) compliant.
  • Use a gateway service so you don't have to store credit card numbers on your site. That makes your life easier because of the greater security. Some gateway services also offer recurring billing so you don't have to do it yourself.
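
The billing flow described above is simple to sketch. The following hypothetical nightly job illustrates the bill-on-success, notify-on-failure, freeze-after-three-declines logic from this list; the Account model, Gateway wrapper, and mailer names are assumptions for illustration, not 37signals' actual billing code.

    # A hypothetical nightly billing job, not 37signals' actual system. The
    # Account model, Gateway wrapper, and BillingMailer are assumed names.
    class MonthlyBillingJob
      MAX_DECLINES = 3

      def run
        Account.due_for_billing.each do |account|
          response = Gateway.charge(account.billing_token, account.monthly_price_in_cents)

          if response.success?
            # Record the charge and email an invoice on success.
            account.record_successful_charge!(response.transaction_id)
            BillingMailer.deliver_invoice(account, response)
          else
            # Email an explanation on failure and count the decline.
            account.increment!(:consecutive_declines)
            BillingMailer.deliver_decline_notice(account, response.message)

            # After three declines, freeze the account until a valid card is provided.
            if account.consecutive_declines >= MAX_DECLINES
              account.freeze_until_card_updated!
            end
          end
        end
      end
    end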

    Customer Support

  • Campfire is used for customer service. Campfire is a web-based group chat tool, password-protectable, with chatting, file sharing, image previewing, and decision making.
  • Issues discussed are used to drive code changes, and the Subversion commit is shown in the conversation. This seems to skip a bug tracking system, which would make it hard to manage bugs and features in any traditional sense; i.e., you can't trace Subversion changes back to a bug and you can't report which features and bugs are in a release.
  • Support can solve problems by customers uploading images, sharing screens, sharing files, and chatting in real-time.
  • Developers are always on within Campfire addressing problems in real-time with the customers.

    Lessons Learned

  • Take a lesson from Amazon and build internal functions as services from the start. This makes it easier to share them across all product lines and transparently upgrade features (see the sketch after this list).
  • Don't store credit card numbers on your site. This greatly reduces your security risk.
  • Developers and customers should interact in real time on a public forum. Customers get better service because developers handle issues as they come up in the normal flow of their development cycle. Several layers of the usual BS are removed. Developers learn what customers like and dislike, which makes product development more agile. Potential customers can see how responsive the company is by reading the interactions, which goes a long way toward giving them the confidence and motivation to sign up.
  • Evolve your software based on the actual features users need instead of making up features someone might need someday. Otherwise you end up building something that nobody wants and that won't work anyway.
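
As a rough illustration of the "internal functions as services" lesson, here is a hypothetical thin Ruby client for a centralized billing service; the host, paths, and response shape are assumptions, not a description of 37signals' real service.

    # A hypothetical thin client for a centralized internal billing service.
    # The host, paths, and JSON fields are illustrative assumptions only.
    require 'net/http'
    require 'uri'
    require 'json'

    class BillingServiceClient
      BASE_URI = URI.parse("http://billing.internal.example.com")

      # Every product line (Basecamp, Highrise, Backpack, Campfire) can share this
      # one client instead of re-implementing subscription logic per application.
      def subscription_status(account_id)
        response = Net::HTTP.get_response(BASE_URI.merge("/accounts/#{account_id}/subscription"))
        JSON.parse(response.body)
      end

      def charge(account_id, amount_in_cents)
        response = Net::HTTP.post_form(
          BASE_URI.merge("/accounts/#{account_id}/charges"),
          "amount_in_cents" => amount_in_cents.to_s
        )
        JSON.parse(response.body)
      end
    end
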
    Reader Comments

    very nice! other companies could release some of their internal IT info/structure as well ,)

    December 31, 1999 | Unregistered Commenter Eimantas

    You don't mention what kind of load balancer you use.

    Also, does Mongrel handle requests directly or do you have a proxy in front of it? (Perhaps doing load balancing?)

    December 31, 1999 | Unregistered Commenter Anonymous

    The architecture looks ok. But the numbers aren't impressive...
    I would expect that the number of their servers is due to low speed of ruby in generating content.
    And by the looks of it, they use a lot of S3 that means that they have unloaded a lot of file storage to Amazon infra.
    Good luck to them with expanding without expansion :)

    December 31, 1999 | Unregistered Commenter JAlexoid

    What kind of db servers are they using?

    Is it one db or one db per account.

    December 31, 1999 | Unregistered Commenter Anonymous

    Interesting...
    Do you know what each of their 30 servers do in details and how do they backup the huge amount of data in their databases?

    December 31, 1999 | Unregistered Commenter Jason Smith

    Well referenced list. I can agree completely with the lessons learned list - especially not storing critical info (aka CCs) on the server, and also focusing on internal functions from the start as they relate to expansion.

    December 31, 1999 | Unregistered Commenter weddings

    I have heard that ROR has a difficult time handling large amounts of queries, is this true?

    December 31, 1999 | Unregistered Commenter Amoils
