Entries in MySQL (70)

Tuesday
Jul282009

37signals Architecture

Update 7: Basecamp, now with more vroom. Basecamp application servers running Ruby code were upgraded and virtualization was removed from those hosts. The result: a 66% reduction in response time while handling multiples of the traffic, which is beyond what I expected. They still use virtualization (Linux KVM), just less of it now.
Update 6: Things We’ve Learned at 37Signals. Themes: less is more; don't worry be happy.
Update 5: Nuts & Bolts: HAProxy. A nice explanation (post, screencast) by Mark Imbriaco of why HAProxy (a load balancing proxy server) is their favorite (fast, efficient, graceful configuration, queues requests when Mongrels are busy) for routing dynamic requests between the Apache web servers and the Mongrel application servers.
Update 4: O'Reilly's Tim O'Brien interviews David Heinemeier Hansson, Rails creator and 37signals partner. He says Basecamp scales horizontally on the application and web tiers and scales up for the database, using one "big ass" 128GB machine. He says: as technology moves on, hardware gets cheaper and cheaper. In my mind, you don't want to shard unless you positively have to; it's sort of a last resort approach.
Update 3: The need for speed: Making Basecamp faster. Pages now load twice as fast, with CPU usage cut by a third and database time by about half. Results were achieved through analysis, caching, MySQL optimizations, and hardware upgrades.
Update 2: Customer support is handled in real-time using Campfire.
Update: highly useful information on creating a customer billing system.


In the giving spirit of Christmas, the folks at 37signals have shared a bit about how their system works. 37signals is most famous for unleashing Ruby on Rails on the world, and they've used RoR to build their very popular Basecamp, Highrise, Backpack, and Campfire products. RoR takes a lot of heat for being a performance dog, but 37signals seems to handle a lot of traffic with relatively normal sounding resources. This is just an initial data dump; they promise to add more details later. As they do, I'll update this profile.

Site: http://www.37signals.com

Information Sources

  • Ask 37signals: Numbers?
  • Ask 37signals: How do you process credit cards?
  • Behind the scenes at 37signals: Support
  • Ask 37signals: Why did you restart Highrise?

    Platform

  • Ruby on Rails
  • Memcached
  • Xen
  • MySQL
  • S3 for image storage

    The Stats

  • 30 servers, ranging from single processor file servers to 8 CPU application servers, for a total of about 100 CPUs and 200 GB of RAM.
  • Plan to diagonally scale by reducing the number of servers to 16, for about 92 CPU cores (each significantly faster than those used today) and 230 GB of combined RAM.
  • Xen virtualization will be used to improve system management.
  • Basecamp (web based project management)
    * 2,000,000 people with accounts
    * 1,340,000 projects
    * 13,200,000 to-do items
    * 9,200,000 messages
    * 12,200,000 comments
    * 5,500,000 time tracking entries
    * 4,000,000 milestones

  • Backpack (personal and small business information management)
    * Just under 1,000,000 pages
    * 6,800,000 to-do items
    * 1,500,000 notes
    * 829,000 photos
    * 370,000 files

  • Overall storage stats (Nov 2007)
    * 5.9 terabytes of customer-uploaded files
    * 888 GB of files uploaded (900,000 requests)
    * 2 TB of files downloaded (8,500,000 requests)

    The Architecture

  • Memcached caching is used and they are looking to add more of it; it yields impressive performance results (a cache-aside sketch follows this list).
  • URL helper methods are used rather than building the URLs by hand.
  • Standard ActiveRecord built queries are used, but for performance reasons they will also "dig in and use" find_by_sql when necessary.
  • They fix Rails when they run into performance problems. It pays to be king :-)
  • Amazon's S3 is used for storage of files uploaded by users. They are extremely happy with the results.
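
The post doesn't include code, but the memcached usage described above is the classic cache-aside pattern. A minimal sketch in Python (37signals' actual code would be Ruby/Rails); the pymemcache client, the key scheme, and the TTL are assumptions made for illustration:

```python
# Cache-aside sketch (illustrative only; not 37signals' code).
# Assumes a memcached instance on localhost:11211 and the pymemcache client.
import json

from pymemcache.client.base import Client

cache = Client(("localhost", 11211))
CACHE_TTL = 300  # seconds; arbitrary for this sketch


def fetch_project(project_id, load_from_db):
    """Return project data, reading through memcached before hitting MySQL."""
    key = f"project:{project_id}"               # hypothetical key scheme
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)               # cache hit: no database work at all
    row = load_from_db(project_id)              # cache miss: run the SQL query
    cache.set(key, json.dumps(row), expire=CACHE_TTL)
    return row
```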

    Credit Card Processing Process

  • Bill monthly. It makes credit card companies more comfortable because they won't be on the hook for a large chunk of change if your company goes out of business. Customers also like it better because it costs less up front and you don't need a contract. Just pay as long as you want the service.

  • Get a Merchant Account. One is needed to process credit cards. They use Chase Bank. Use someone you trust and later negotiate rates when you get enough volume that it matters.
  • Authorize.net is the gateway they use to process the credit card charge.
  • A custom built system handles the monthly billing. It runs each night, bills the appropriate people, and records the results (a sketch of this loop follows the list).
  • On success an invoice is sent via email.
  • On failure an explanation is sent to the customer.
  • If the card is declined three times the account is frozen until a valid card number is provided.
  • Error handling is critical because problems with charges are common. Freezing too fast is bad; freezing too slow is also bad.
  • All products are being converted to using a centralized billing service.
  • You need to be PCI DSS (Payment Card Industry Data Security Standard) compliant.
  • Use a gateway service that makes it so you don't have to store credit card numbers on your site. That makes your life easier because of the greater security. Some gateway services also offer recurring billing so you don't have to implement it yourself.
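
37signals doesn't publish their billing code, but the monthly process described above (run nightly, charge each account that's due, email an invoice on success, email an explanation on failure, freeze after three declines) maps onto a simple loop. A hedged sketch; every helper name here is invented for illustration, and the gateway call stands in for something like an Authorize.net request:

```python
# Illustrative sketch of the nightly billing run described above.
# All helpers (accounts_due_today, charge_via_gateway, send_invoice,
# send_decline_notice, freeze_account, record_result) are hypothetical.

MAX_DECLINES = 3  # per the post: freeze after three failed charges


def run_nightly_billing(accounts_due_today, charge_via_gateway,
                        send_invoice, send_decline_notice,
                        freeze_account, record_result):
    for account in accounts_due_today():
        result = charge_via_gateway(account)      # e.g. a call to the payment gateway
        record_result(account, result)            # always keep an audit trail
        if result.success:
            account.decline_count = 0
            send_invoice(account, result)         # success: email an invoice
        else:
            account.decline_count += 1
            send_decline_notice(account, result)  # failure: explain what went wrong
            if account.decline_count >= MAX_DECLINES:
                freeze_account(account)           # frozen until a valid card is provided
```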

    Customer Support

  • Campfire is used for customer service. Campfire is a web-based group chat tool, password-protectable, with chatting, file sharing, image previewing, and decision making.
  • Issues discussed are used to drive code changes, and the Subversion commit is shown in the conversation. This seems to skip a bug tracking system, which would make it hard to manage bugs and features in any traditional sense, i.e., you can't track Subversion changes back to a bug and you can't report which features and bugs are in a release.
  • Support can solve problems by customers uploading images, sharing screens, sharing files, and chatting in real-time.
  • Developers are always on within Campfire addressing problems in real-time with the customers.

    Lessons Learned

  • Take a lesson from Amazon and build internal functions as services from the start. This makes it easier to share them across all product lines and transparently upgrade features.
  • Don't store credit card numbers on your site. This greatly reduces your security risk.
  • Developers and customers should interact in real-time on a public forum. Customers get better service as developers handle issues as they come up in the normal flow of their development cycle. Several layers of the usual BS are removed. Developers learn what customers like and dislike, which makes product development more agile. Potential customers can see how responsive the company is by reading the interactions, which goes a long way toward giving them the confidence and motivation to sign up.
  • Evolve your software based on the features users actually need instead of making up features someone might need someday. Otherwise you end up building something that nobody wants and that won't work anyway.
    Wednesday
    Apr292009

    Presentations: MySQL Conference & Expo 2009

    The presentations from the MySQL Conference & Expo 2009, held April 20-23 in Santa Clara, are available at the above link.

    They include:

    • Beginner's Guide to Website Performance with MySQL and memcached by Adam Donnison

    • Calpont: Open Source Columnar Storage Engine for Scalable MySQL DW by Jim Tommaney

    • Creating Quick and Powerful Web Applications with MySQL, GlassFish, and NetBeans by Arun Gupta

    • Deep-inspecting MySQL with DTrace by Domas Mituzas

    • Distributed Innodb Caching with memcached by Matthew Yonkovit and Yves Trudeau

    • Improving Performance by Running MySQL Multiple Times by MC Brown

    • Introduction to Using DTrace with MySQL by Vince Carbone

    • MySQL Cluster 7.0 - New Features by Johan Andersson

    • Optimizing MySQL Performance with ZFS by Allan Packer

    • SAN Performance on an Internal Disk Budget: The Coming Solid State Disk Revolution by Matthew Yonkovit

    • This is Not a Web App: The Evolution of a MySQL Deployment at Google by Mark Callaghan
    Friday
    Apr102009

    Facebook's Aditya giving presentation on Facebook Architecture

    Facebook's engineering director Aditya talks about Facebook's architecture: how they use MySQL, PHP, and memcache, and how they have modified each to suit their requirements.

    Click to read more ...

    Saturday
    Apr042009

    Digg Architecture

    Update 4: Introducing Digg's IDDB Infrastructure by Joe Stump. IDDB is a way to partition both indexes (e.g. integer sequences and unique character indexes) and actual tables across multiple storage servers (MySQL and MemcacheDB are currently supported, with more to follow). Update 3: Scaling Digg and Other Web Applications. Update 2: How Digg Works and How Digg Really Works (wear ear plugs). Brought to you straight from Digg's blog, a very succinct explanation of the major elements of the Digg architecture that traces a request through the system. I've updated this profile with the new information. Update: Digg now receives 230 million plus page views per month and 26 million unique visitors - traffic that necessitated major internal upgrades. Traffic generated by Digg's more than 22 million famously info-hungry users and 230 million page views can crash an unsuspecting website head-on into its CPU, memory, and bandwidth limits. How does Digg handle billions of requests a month?

    Site: http://digg.com

    Information Sources

  • How Digg Works by Digg
  • How Digg.com uses the LAMP stack to scale upward
  • Digg PHP's Scalability and Performance

    Platform

  • MySQL
  • Linux
  • PHP
  • Lucene
  • Python
  • APC PHP Accelerator
  • MCache
  • Gearman - job scheduling system
  • MogileFS - open source distributed filesystem
  • Apache
  • Memcached

    The Stats

  • Started in late 2004 with a single Linux server running Apache 1.3, PHP 4, and MySQL 4.0 using the default MyISAM storage engine.
  • Over 22 million users.
  • 230 million plus page views per month
  • 26 million unique visitors per month
  • Several billion page views per month
  • None of the scaling challenges faced had anything to do with PHP. The biggest issues faced were database related.
  • Dozens of web servers.
  • Dozens of DB servers.
  • Six specialized graph database servers to run the Recommendation Engine.
  • Six to ten machines that serve files from MogileFS.

    What's Inside

  • Specialized load balancer appliances monitor the application servers, handle failover, constantly adjust the cluster according to health, balance incoming requests, and cache JavaScript, CSS, and images. If you don't have the fancy load balancers, take a look at Linux Virtual Server and Squid as a replacement.
  • Requests are passed to the Application Server cluster. Application servers consist of Apache+PHP, Memcached, Gearman, and other daemons. They are responsible for coordinating access to different services (DB, MogileFS, etc.) and creating the response sent to the browser.
  • Uses a MySQL master-slave setup (a minimal routing sketch follows this list).
    * Four master databases are partitioned by functionality: promotion, profiles, comments, main. Many slave databases hang off each master.
    * Writes go to the masters and reads go to the slaves.
    * Transaction-heavy servers use the InnoDB storage engine.
    * OLAP-heavy servers use the MyISAM storage engine.
    * They did not notice a performance degradation moving from MySQL 4.1 to version 5.
    * The schema is denormalized more than "your average database design."
    * Sharding is used to break the database into several smaller ones.
  • Digg's usage pattern makes it easier for them to scale. Most people just view the front page and leave. Thus 98% of Digg's database accesses are reads. With this balance of operations they don't have to worry about the complex work of architecting for writes, which makes it a lot easier for them to scale.
  • They had problems with their storage system telling them writes were on disk when they really weren't. Controllers do this to improve the appearance of their performance. But what it does is leave a giant data integrity hole in failure scenarios. This is really a pretty common problem and can be hard to fix, depending on your hardware setup.
  • To lighten their database load they use the APC PHP accelerator and MCache.
  • Memcached is used for caching and memcached servers seemed to be spread across their database and application servers. A specialized daemon monitors connections and kills connections that have been open too long.
  • You can configure PHP not to parse and compile on each load by using a combination of Apache 2's worker threads, FastCGI, and a PHP accelerator. On a page's first load the PHP code is compiled, so any subsequent page loads are very fast.
  • MogileFS, a distributed file system, serves story icons, user icons, and stores copies of each story’s source. A distributed file system spreads and replicates files across a lot of disks which supports fast and scalable file access.
  • A specialized Recommendation Engine service was built to act as their distributed graph database. Relational databases are not well structured for generating recommendations so a separate service was created. LinkedIn did something similar for their graph.
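
Digg's data-access layer isn't shown, but the master/slave split described above is commonly implemented as simple connection routing: writes go to the functional partition's master, reads go to one of its slaves. A minimal sketch; the partition names come from the bullet above, while the host names and the random slave-picking policy are assumptions:

```python
# Read/write routing sketch for a functionally partitioned master-slave setup.
# Host names and slave-selection policy are illustrative, not Digg's topology.
import random

CLUSTERS = {
    "comments": {"master": "comments-master", "slaves": ["comments-s1", "comments-s2"]},
    "profiles": {"master": "profiles-master", "slaves": ["profiles-s1"]},
    "promotion": {"master": "promo-master", "slaves": ["promo-s1", "promo-s2"]},
}


def route(partition, is_write):
    """Return the DB host a query should go to: the master for writes, a slave for reads."""
    cluster = CLUSTERS[partition]
    if is_write:
        return cluster["master"]
    return random.choice(cluster["slaves"])  # naive load spreading across the slaves


# A comment INSERT hits the master; a comment SELECT hits one of the slaves.
print(route("comments", is_write=True))   # -> comments-master
print(route("comments", is_write=False))  # -> comments-s1 or comments-s2
```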

    Lessons Learned

  • The number of machines isn't as important as what the pieces are and how they fit together.
  • Don't treat the database as a hammer. Recommendations didn't fit well with the relational model, so they made a specialized service.
  • Tune MySQL through your database engine selection. Use InnoDB when you need transactions and MyISAM when you don't. For example, tables that are transactional (InnoDB) on the master can use the faster MyISAM engine on read-only slaves (see the sketch after this list).
  • At some point in their growth curve they were unable to grow by adding RAM so had to grow through architecture.
  • People often complain Digg is slow. This is perhaps due to their large JavaScript libraries rather than their backend architecture.
  • One way they scale is by being careful of which applications they deploy on their system. They are careful not to release applications which use too much CPU. Clearly Digg has a pretty standard LAMP architecture, but I thought this was an interesting point. Engineers often have a bunch of cool features they want to release, but those features can kill an infrastructure if that infrastructure doesn't grow along with the features. So push back until your system can handle the new features. This goes to capacity planning, something Flickr emphasizes in their scaling process.
  • You have to wonder whether, by limiting new features to match their infrastructure, Digg might lose ground to other faster moving social bookmarking services. Perhaps if the infrastructure were more easily scaled they could add features faster, which would help them compete better. On the other hand, just adding features because you can doesn't make a lot of sense either.
  • The data layer is where most scaling and performance problems are found, and these problems are not language specific. You'll hit them using Java, PHP, Ruby, or insert your favorite language here.
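
The engine-selection lesson can be applied per table and per replica: because MySQL replication does not require the same storage engine on both ends, a table can stay InnoDB on the write master while its copy on a read-only slave runs MyISAM. A hedged sketch using the mysql.connector package; the connection parameters and table names are placeholders:

```python
# Sketch: convert selected tables to MyISAM on a read-only slave while the
# master keeps InnoDB for transactional writes. Connection details and table
# names are placeholders; run this against a slave, never the master.
import mysql.connector

READ_ONLY_TABLES = ["stories", "comments"]  # hypothetical table names

conn = mysql.connector.connect(
    host="slave-db.example.com", user="admin", password="secret", database="app"
)
cur = conn.cursor()
for table in READ_ONLY_TABLES:
    # MyISAM drops InnoDB's transactional overhead for a read-mostly workload.
    cur.execute(f"ALTER TABLE {table} ENGINE=MyISAM")
cur.close()
conn.close()
```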

    Related Articles

  • LinkedIn Architecture
  • Live Journal Architecture
  • Flickr Architecture
  • An Unorthodox Approach to Database Design: The Coming of the Shard
  • eBay Architecture

    Click to read more ...

    Thursday
    Jan292009

    Event: MySQL Conference & Expo 2009

    The 5th annual MySQL Conference & Expo, co-presented by Sun Microsystems, MySQL, and O'Reilly Media, happens April 20-23, 2009 in Santa Clara, CA, at the Santa Clara Convention Center and Hyatt Regency Santa Clara. It brings over 2,000 open source and database enthusiasts together to harness the power of MySQL and celebrate the huge MySQL ecosystem. All around the world, people just like you are innovating with MySQL, and MySQL is fueling the innovation engine by releasing new mission critical solutions to help you work smarter. This deeply technical conference brings all of that creativity, energy, and knowledge together in one place for four very full days. Early registration ends February 16, 2009.

    The largest gathering of MySQL developers, users, and DBAs worldwide, the event reflects MySQL's wide-ranging appeal and capabilities. The open atmosphere of the MySQL Conference & Expo helps IT professionals and community members launch and develop the best database applications, tools, and software. As companies of all sizes look for ways to remain competitive and manage costs, open source software and tools provide valuable and efficient solutions for the enterprise. The 2009 edition of the MySQL Conference & Expo will present strategies for businesses to not just survive, but thrive in a challenging economy.

    Through expert instruction, hands-on tutorials, and readily available MySQL developers, users at all levels gain the knowledge they need to rapidly build solid applications with MySQL that scale with the enterprise. New to the 2009 program will be MySQL Camp, a space where any and all participants can create an "unconference" within the larger event.

    Click to read more ...

    Sunday
    Dec282008

    How to Organize a Database Table’s Keys for Scalability

    The key (no pun intended) to understanding how to organize your dataset's keys is to think of each shard not as an individual database, but as part of one large, singular database. Just as in a normal single server database setup where you have a unique key for each row within a table, each row key within each individual shard must be unique to the whole dataset partitioned across all shards. There are a few different ways to accomplish uniqueness of row keys across a shard cluster. Each has its pros and cons, and the one chosen should be specific to the problems you're trying to solve; one simple approach is sketched below.
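
Two common ways to keep row keys unique across every shard are globally unique identifiers such as UUIDs, and composite keys that pack a shard ID next to a locally incrementing sequence. A minimal sketch of both in Python; the bit layout and the routing helper are illustrative choices, not a recommendation:

```python
# Two illustrative schemes for dataset-wide unique row keys across shards.
import uuid


def uuid_key():
    """Option 1: a UUID is unique everywhere, at the cost of larger, unordered keys."""
    return uuid.uuid4().hex


def composite_key(shard_id, local_sequence):
    """Option 2: pack the shard ID and a per-shard auto-increment into one integer.

    The key is unique across the whole dataset because no two shards share a shard_id.
    """
    return (local_sequence << 8) | shard_id  # low 8 bits identify the shard (max 256 shards)


def shard_for(key):
    """Routing is trivial with composite keys: the shard is encoded in the key itself."""
    return key & 0xFF


key = composite_key(shard_id=3, local_sequence=42)
assert shard_for(key) == 3
```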

    Click to read more ...

    Saturday
    Dec202008

    Second Life Architecture - The Grid

    Update: Presentation: Second Life's Architecture. Ian Wilkes, VP of Systems Engineering, describes the architecture used by the popular game Second Life. Ian presents how the architecture looked at its debut and how it evolved over the years as users and features were added. Second Life is a 3-D virtual world created by its Residents. Virtual worlds are expected to become more and more popular on the internet, so their architecture might be of interest. Especially important is the appearance of open virtual worlds, or metaverses. What happens when video games meet Web 2.0? What happens is the metaverse.

    Information Sources

    Platform

    • MySQL
    • Apache
    • Squid
    • Python
    • C++
    • Mono
    • Debian

    What's Inside?

    The Stats

    • ~1M active users
    • ~95M user hours per quarter
    • ~70K peak concurrent users (40% annual growth)
    • ~12Gbit/sec aggregate bandwidth (in 2007)

    Staff (in 2006)

    • 70 FTE + 20 part time
    "about 22 are programmers working on SL itself. At any one time probably 1/3 of the team is on infrastructure, 1/3 is on new features and 1/3 is on various maintenance tasks (bug fixes, general stability and speed improvements) or improvements to existing features. But it varies a lot."

    Software

    Client/Viewer
    • Open Source client
    • Render the Virtual World
    • Handles user interaction
    • Handles locations of objects
    • Gets velocities and does simple physics to keep track of what is moving where
    • No collision detection
    Simulator (Sim): Each geographic area (256x256 meter region) in Second Life runs on a single instantiation of server software, called a simulator or "sim," and each sim runs on a separate core of a server. The Simulator is the primary SL C++ server process which runs on most servers. As the viewer moves through the world it is handed off from one simulator to another.
    • Runs Havok 4 physics engine
    • Runs at 45 frames/sec. If it can't keep up, it will attempt time dilation without reducing the frame rate.
    • Handles storing object state, land parcel state, and terrain height-map state
    • Keeps track of where everything is and does collision detection
    • Sends locations of stuff to viewer
    • Transmits image data in a prioritized queue
    • Sends updates to viewers only when needed (only when collision occurs or other changes in direction, velocity etc.)
    • Runs Linden Scripting Language (LSL) scripts
    • Scripting has been recently upgraded to the much faster Mono scripting engine
    • Handles chat and instant messages
      • Asset Server
        • One big clustered filesystem ~100TB
        • Stores asset data such as textures.
        MySQL database: Second Life started with one database and has subsequently been forced into clustering. They use a ton of MySQL databases running on Debian machines to handle lots of centralized services. Rather than attempt to build the one impossibly large database (all hail the Central Database) or one impossibly large central cluster (all hail the Cluster), Linden Lab instead adopted a divide and conquer strategy based around data partitioning. The good thing is that UUIDs (128-bit unique identifiers) are associated with most things in Second Life, so partitioning is generally doable.
        Backbone: Linden Lab has converted much of their backend architecture away from custom C++/messaging into web services. Certain services have been moved off of MySQL, or cached (Squid) between the queries and MySQL. Presence, in particular Agent Presence (i.e. are you online and where are you on the grid), is a particularly tricky kind of query to partition, so there is now a Python service running on the SL grid called Backbone. It proved to be easier to scale, develop, and maintain than many of their older technologies, and as a result it plays an increasingly important role in the Second Life platform as Linden Lab migrates their legacy code to web services. Two main components of the backbone are open source:
        • Eventlet is a networking library written in Python. It achieves high scalability by using non-blocking I/O while retaining high programmer usability by using coroutines to make the non-blocking I/O operations appear blocking at the source code level (a minimal example follows this list).
        • Mulib is a REST web service framework built on top of Eventlet.
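
Eventlet is open source, so its coroutine style is easy to demonstrate. A tiny sketch of blocking-looking but cooperatively scheduled code; the presence-check function and its numbers are placeholders, not Linden Lab's code:

```python
# Minimal Eventlet sketch: code that reads as blocking, scheduled as green threads.
import eventlet


def check_presence(agent_id):
    # eventlet.sleep yields to other green threads; it stands in for real
    # non-blocking I/O such as a query to a presence service.
    eventlet.sleep(0.1)
    return agent_id, "online"


pool = eventlet.GreenPool(size=100)  # up to 100 concurrent green threads
for agent, status in pool.imap(check_presence, range(10)):
    print(agent, status)
```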

        Hardware

        • 2000+ Servers in 2007
        • ~6000 Servers in early 2008
        • Plans to upgrade to ~10000 (?)
        • 4 sims per machine, for both class 4 and class 5
        • Used all-AMD for years, but are moving from the Opteron 270 to the Intel Xeon 5148
        • The upgrade to "class 5" servers doubled the RAM per machine from 2GB to 4GB and moved to a faster SATA disk
        • Class 1 - 4 are on 100Mb with 1Gb uplinks to the core. Class 5 is on pure 1Gb
        Do you have more details?

        Click to read more ...

    Sunday
    Dec142008

    Scaling MySQL on a 256-way T5440 server using Solaris ZFS and Java 1.7

    How to scale MySQL on a 32 core system with 256 threads? Diagonal scalability in a box. An impressive benchmark that achieved more than 79,000 SQL queries per second on a single 4 RU server! Is this real? If so what is the role of good old horizontal scalability? The goals of the benchmark:

    1. Reach a high throughput of SQL queries on a 256-way Sun SPARC Enterprise T5440
    2. Do it 21st century style, i.e. with MySQL and ZFS, not 20th century style, i.e. with OraSybInf... and VxFS
    3. Do it with minimal tuning i.e as close as possible as out-of-the-box
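
The 79,000 queries/second figure comes from Sun's own setup, but the basic shape of a read-throughput measurement is easy to sketch. A toy version using threads and the mysql.connector package; connection details, the query, and the thread/duration numbers are arbitrary, and a real benchmark would use sysbench-class tooling:

```python
# Toy read-throughput measurement; not the Sun benchmark, just its shape.
import threading
import time

import mysql.connector

DURATION = 10   # seconds per run; arbitrary
THREADS = 32    # concurrent client connections; arbitrary
counter_lock = threading.Lock()
queries_done = 0


def worker():
    global queries_done
    conn = mysql.connector.connect(
        host="db.example.com", user="bench", password="secret", database="test"
    )
    cur = conn.cursor()
    deadline = time.time() + DURATION
    local = 0
    while time.time() < deadline:
        cur.execute("SELECT 1")  # stand-in for the benchmark's read queries
        cur.fetchall()
        local += 1
    with counter_lock:
        queries_done += local
    cur.close()
    conn.close()


threads = [threading.Thread(target=worker) for _ in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("queries/sec:", queries_done / DURATION)
```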

    Click to read more ...

    Monday
    Dec012008

    Deploying MySQL Database in Solaris Cluster Environments

    The MySQL™ database, an open source database, delivers high performance and reliability while keeping costs low by eliminating licensing fees. The Solaris™ Cluster product is an integrated hardware and software environment that can be used to create highly-available data services. This article explains how to deploy the MySQL database in a Solaris Cluster environment. The article addresses the following topics:

    • "Advantages of Deploying MySQL Database with Solaris Cluster" discusses the benefits provided by a Solaris Cluster deployment of the MySQL database.
    • "Overview of Solaris Cluster" provides a high-level description of the hardware and software components of the Solaris Cluster.
    • "Installation and Configuration" explains the procedure for deploying the MySQL database on a Solaris Cluster.

    This article assumes that readers have a basic understanding of Solaris Cluster and MySQL database installation and administration.

    Click to read more ...

    Monday
    Dec012008

    MySQL Database Scale-out and Replication for High Growth Businesses

    It is widely recognized that MySQL is the most popular database software in the world. Since its inception in 1995, there have been 11 million product installations around the world in a wide variety of markets. There are more installations of MySQL in use today than any other database architecture. From startup companies hoping to be the next Web2.0 poster child to large global enterprises, the MySQL database architecture has proven to be flexible, extendable, scalable, and more than capable of filling high-capacity database roles in very different venues.

    Click to read more ...