Thursday
Feb052009

Product: HAProxy - The Reliable, High Performance TCP/HTTP Load Balancer

Update: Load Balancing in Amazon EC2 with HAProxy. Grig Gheorghiu writes a nice post on HAProxy functionality and configuration: emulating virtual servers, logging, SSL, load balancing algorithms, session persistence with cookies, server health checks, and more. Adapted from the website: HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications. It is particularly suited for web sites crawling under very high loads while needing persistence or Layer 7 processing. Supporting tens of thousands of connections is clearly realistic with today's hardware. Its mode of operation makes its integration into existing architectures very easy and riskless, while still offering the possibility not to expose fragile web servers to the Net. Currently, two major versions are supported:

  • version 1.1 - has kept critical sites online since 2002. The most stable and reliable version, it has reached years of uptime. It receives no new features and is dedicated to mission-critical usage only.
  • version 1.2 - opens the way to very high traffic sites. The same as 1.1 with new features such as poll/epoll support for very large numbers of sessions, IPv6 on the client side, application cookies, hot reconfiguration, advanced dynamic load regulation, TCP keepalive, source hash, weighted load balancing, an rbtree-based scheduler, and a nice web status page. This code is still evolving but has significantly stabilized since 1.2.8.

Unlike other free "cheap" load-balancing solutions, this product is only used by a few hundred people around the world, but those people run very big sites serving several million hits and between several tens of gigabytes and several terabytes per day to hundreds of thousands of clients. They need 24x7 availability and have the internal skills to maintain a free software solution. Often the solution is deployed for internal use, and I only know about it when they send me some positive feedback or when they ask for a missing feature ;-) According to many users, HAProxy competes quite well with the likes of Pound and UltraMonkey.
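To give a flavor of the configuration topics Grig covers (balancing algorithm, cookie-based session persistence, health checks, server weights), here is a minimal, hypothetical haproxy.cfg sketch; the listener name, addresses, and health-check URL are invented for illustration:

```
# Minimal illustrative HAProxy config (hypothetical names and addresses)
listen webfarm 0.0.0.0:80
    mode http
    balance roundrobin              # load balancing algorithm
    cookie SERVERID insert indirect # session persistence via an inserted cookie
    option httpchk GET /health      # active server health checks
    server web1 10.0.0.11:80 cookie w1 check weight 3
    server web2 10.0.0.12:80 cookie w2 check weight 1
```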

Click to read more ...

Thursday
Feb052009

Beta testers wanted for ultra high-scalability/performance clustered object storage system designed for web content delivery

DataDirect Networks (www.ddn.com) is searching for beta testers for our exciting new object-based clustered storage system. Does this sound like you?

  • Need to store millions to hundreds of billions of files
  • Want to use one big file system but can't because no single file system scales big enough
  • Running out of inodes
  • Have to constantly tweak file systems to perform better
  • Need to replicate content to more than one data center across geographies
  • Have thumbnail images or other small files that wreak havoc on your file and storage systems
  • Constantly tweaking and engineering around performance and scalability limits
  • No storage system delivers enough IOPS to serve your content
  • Spend time load balancing the storage environment
  • Want a single, simple way to manage all this data

If this sounds like you, please contact me at jgoldstein@ddn.com. DataDirect Networks is a 10-year-old, well-established storage systems company specializing in Extreme Storage environments. We've deployed both the largest and the fastest storage/file systems on the planet - currently running at over 250 GB/s. Our upcoming product is going to change the way storage is deployed for scalable web content, and we're seeking testers who can throw their most challenging problems at our new system. It's time for something better and we're going to deliver it.

Click to read more ...

Tuesday
Feb032009

10 More Rules for Even Faster Websites

Update: How-To Minimize Load Time for Fast User Experiences. Shows how to analyze the bottlenecks preventing websites and blogs from loading quickly and how to resolve them. 80-90% of the end-user response time is spent on the frontend, so it makes sense to concentrate efforts there before heroically rewriting the backend. Take a shower before buying a Porsche, if you know what I mean. Steve Souders, author of High Performance Web Sites and YSlow, has ten more best practices to speed up your website:

  • Split the initial payload
  • Load scripts without blocking
  • Don’t scatter scripts
  • Split dominant content domains
  • Make static content cookie-free
  • Reduce cookie weight
  • Minify CSS
  • Optimize images
  • Use iframes sparingly
  • To www or not to www

Sadly, according to String Theory, there are only 26.7 rules left, so get them while they're still in our dimension. Here are slides on the first few rules. Love the speeding dog slide. That's exactly what my dog looks like traveling down the road, head hanging out the window, joyfully battling the wind. Also see 20 New Rules for Faster Web Pages.

    Click to read more ...

    Tuesday
    Feb032009

    Paper: Optimistic Replication

    To scale in the large you have to partition. Data has to be spread around, replicated, and kept consistent (keeping replicas sufficiently similar to one another despite operations being submitted independently at different sites). The result is a highly available, well-performing, and scalable system. Partitioning is required, but it's a pain to do efficiently and correctly. Until quantum teleportation becomes a reality, how data is kept consistent across a bewildering number of failure scenarios is a key design decision. This excellent paper by Yasushi Saito and Marc Shapiro takes us on a wild ride (OK, maybe not so wild) through different approaches to achieving consistency. What's cool about this paper is that they go over some real systems we are familiar with and cover how they work: DNS (single-master, state-transfer), Usenet (multi-master), PDAs (multi-master, state-transfer, manual or application-specific conflict resolution), Bayou (multi-master, operation-transfer, epidemic propagation, application conflict resolution), CVS (multi-master, operation-transfer, centralized, manual conflict resolution). The paper then goes on to explain in detail the different approaches to achieving consistency. Most of us will never have to write the central nervous system of an application like this, but knowing about the different approaches and tradeoffs is priceless. The abstract:

    Data replication is a key technology in distributed data sharing systems, enabling higher availability and performance. This paper surveys optimistic replication algorithms that allow replica contents to diverge in the short term, in order to support concurrent work practices and to tolerate failures in low-quality communication links. The importance of such techniques is increasing as collaboration through wide-area and mobile networks becomes popular. Optimistic replication techniques are different from traditional “pessimistic” ones. Instead of synchronous replica coordination, an optimistic algorithm propagates changes in the background, discovers conflicts after they happen and reaches agreement on the final contents incrementally. We explore the solution space for optimistic replication algorithms. This paper identifies key challenges facing optimistic replication systems — ordering operations, detecting and resolving conflicts, propagating changes efficiently, and bounding replica divergence—and provides a comprehensive survey of techniques developed for addressing these challenges.
    If you can't wait to know the ending, here's the summary of the paper:
    We summarize some of the lessons learned from our own experience and in reviewing the literature. Optimistic, asynchronous data replication is an appealing technique; it indeed improves networking flexibility and scalability. Some environments or application areas could simply not function without optimistic replication. However, optimistic replication also comes with a cost. The algorithmic complexity of ensuring eventual consistency can be high. Conflicts usually require application-specific resolution, and the lost update problem is ultimately unavoidable. Hence our recommendations: (1) Keep it simple. Traditional, pessimistic replication, with many off-the-shelf solutions, is perfectly adequate in small-scale, fully connected, reliable networking environments. Where pessimistic techniques are the cause of poor performance or lack of availability, or do not scale well, try single-master replication: it is simple, conflict-free, and scales well in practice. State transfer using Thomas’s write rule works well for many applications. Advanced techniques such as version vectors and operation transfer should be used only when you need flexibility and semantically rich conflict resolution. (2) Propagate operations quickly to avoid conflicts. While connected, propagate often and keep replicas in close synchronization. This will minimize divergence when disconnection does occur. (3) Exploit commutativity. Commutativity should be the default; design your system so that non-commutative operations are the uncommon case. For instance, whenever possible, partition data into small, independent objects. Within an object, use monotonic data structures such as an append-only log, a monotonically increasing counter, or a union-only set. When operations are dependent upon each other, represent the invariants explicitly.
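    Recommendation (3) is easy to see in code. Here's a minimal sketch, not from the paper, of two of the structures it mentions: a union-only (grow-only) set whose merges commute, and a version vector for detecting concurrent updates. The class and method names are purely illustrative:

```python
# Illustrative sketch (not from the paper): monotonic, commutative structures.

class GrowOnlySet:
    """A union-only set: merges commute, so replicas converge in any order."""
    def __init__(self):
        self.items = set()

    def add(self, item):
        self.items.add(item)

    def merge(self, other):
        # Set union is commutative, associative, and idempotent.
        self.items |= other.items


class VersionVector:
    """Per-replica update counts used to detect concurrent (conflicting) updates."""
    def __init__(self):
        self.clock = {}  # replica_id -> number of updates seen from that replica

    def bump(self, replica_id):
        self.clock[replica_id] = self.clock.get(replica_id, 0) + 1

    def dominates(self, other):
        return all(self.clock.get(r, 0) >= c for r, c in other.clock.items())

    def concurrent_with(self, other):
        # Neither history contains the other: a conflict the application must resolve.
        return not self.dominates(other) and not other.dominates(self)
```

    Because merges commute and are idempotent, replicas can exchange state in any order and still converge; the version vector only flags the cases that truly need application-level conflict resolution.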

    Related Articles

  • The End of an Architectural Era (It’s Time for a Complete Rewrite)
  • Big Table
  • Google's Paxos Made Live – An Engineering Perspective
  • Dynamo: Amazon’s Highly Available Key-value Store
  • Eventually Consistent - Revisited by Werner Vogels

    Click to read more ...

    Sunday
    Feb012009

    More Chips Means Less Salsa

    Yes, I just got through watching the Super Bowl, so chips and salsa are on my mind and in my stomach. In recreational eating, more chips requires downing more salsa. With multicore chips it turns out that as cores go up, salsa goes down, salsa obviously being a metaphor for speed. Sandia National Laboratories found in their simulations a significant increase in speed going from two to four multicores, but an insignificant increase from four to eight multicores. Exceeding eight multicores causes a decrease in speed. Sixteen multicores perform barely as well as two, and after that, a steep decline is registered as more cores are added. The problem is the lack of memory bandwidth as well as contention between processors over the memory bus available to each processor. The implication for those following a diagonal scaling strategy is to work like heck to make your system fit within eight multicores. After that you'll need to consider some sort of partitioning strategy. What's interesting is the research on where the cutoff point will be.
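    A quick back-of-the-envelope model (my own illustration with made-up numbers, not Sandia's data) shows why the curve flattens: once the cores collectively demand more bandwidth than the shared memory bus supplies, adding cores stops helping. The further decline measured beyond eight cores comes from contention overhead, which this toy model ignores:

```python
# Toy model (illustrative numbers only): speedup is capped by the shared
# memory bus once aggregate demand exceeds the available bandwidth.

BUS_BANDWIDTH_GBPS = 12.0    # total bandwidth of the shared memory bus
PER_CORE_DEMAND_GBPS = 1.6   # bandwidth one core needs to run at full speed

def effective_speedup(cores: int) -> float:
    demanded = cores * PER_CORE_DEMAND_GBPS
    if demanded <= BUS_BANDWIDTH_GBPS:
        return float(cores)  # compute-bound: scales with core count
    return BUS_BANDWIDTH_GBPS / PER_CORE_DEMAND_GBPS  # memory-bound: flat ceiling

for n in (2, 4, 8, 16):
    print(n, "cores ->", round(effective_speedup(n), 1), "x")
# 2 cores -> 2.0x, 4 cores -> 4.0x, 8 cores -> 7.5x, 16 cores -> 7.5x
```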

    Click to read more ...

    Thursday
    Jan292009

    Event: MySQL Conference & Expo 2009

    The 5th annual MySQL Conference & Expo, co-presented by Sun Microsystems, MySQL, and O'Reilly Media, happens April 20-23, 2009 in Santa Clara, CA, at the Santa Clara Convention Center and Hyatt Regency Santa Clara, and brings over 2,000 open source and database enthusiasts together to harness the power of MySQL and celebrate the huge MySQL ecosystem. All around the world, people just like you are innovating with MySQL—and MySQL is fueling the innovation engine by releasing new mission-critical solutions to help you work smarter. This deeply technical conference brings all of that creativity, energy, and knowledge together in one place for four very full days. Early registration ends February 16, 2009. The largest gathering of MySQL developers, users, and DBAs worldwide, the event reflects MySQL's wide-ranging appeal and capabilities. The open atmosphere of the MySQL Conference & Expo helps IT professionals and community members launch and develop the best database applications, tools, and software. As companies of all sizes look for ways to remain competitive and manage costs, open source software and tools provide valuable and efficient solutions for the enterprise. The 2009 edition of the MySQL Conference & Expo will present strategies for businesses to not just survive, but thrive in a challenging economy. Through expert instruction, hands-on tutorials, and readily available MySQL developers, users at all levels gain the knowledge they need to rapidly build solid applications with MySQL that scale with the enterprise. New to the 2009 program will be MySQL Camp, a space where any and all participants can create an "unconference" within the larger event.

    Click to read more ...

    Tuesday
    Jan272009

    Video: Storage in the Cloud at Joyent

    Ben Rockwood of Joyent speaks on "Storage in the Cloud" at the first OpenSolaris Storage Summit. Ben is the Director of Systems at Joyent. The Joyent Accelerators are based on OpenSolaris and ZFS, and he has deep experience with OpenSolaris in the real world.

    Click to read more ...

    Monday
    Jan262009

    Paper: Scalability by Design - Coding for Systems With Large CPU Counts

    The multi-cores are coming and software designed for fewer cores usually doesn't work on more cores without substantial redesign. For a taste of the issues take a look at No new global mutexes! (and how to make the thread/connection pool work), which shows some of the difficulties of making MySQL perform well on SMP servers. In this paper, Richard Smith, a Staff Engineer at Sun, goes into some nice detail on multi-core issues. His take-home lessons are listed below, with a rough sketch of the first one after the list:

  • Use fine-grained locking or lock-free strategy
  • Document the strategy, including correctness criteria (invariants)
  • Keep critical sections short
  • Profile the code at both light and heavy load
  • Collect HW performance counter data
  • Identify bottleneck resource (there's always at least one!)
  • Eliminate or ameliorate it
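    To make the first lesson concrete, here's a minimal sketch of fine-grained (per-bucket) locking, sometimes called lock striping, applied to a shared counter table. It's a generic illustration, not code from the paper:

```python
# Illustrative sketch (not from the paper): lock striping replaces one global
# mutex with many bucket-level locks, so threads touching different keys
# rarely contend and critical sections stay short.
import threading

class StripedCounters:
    def __init__(self, stripes: int = 64):
        self.locks = [threading.Lock() for _ in range(stripes)]
        self.counts = [dict() for _ in range(stripes)]

    def _stripe(self, key) -> int:
        return hash(key) % len(self.locks)

    def increment(self, key):
        i = self._stripe(key)
        with self.locks[i]:  # only this stripe is locked, not the whole table
            self.counts[i][key] = self.counts[i].get(key, 0) + 1

    def get(self, key) -> int:
        i = self._stripe(key)
        with self.locks[i]:
            return self.counts[i].get(key, 0)
```

    Because each key maps to one stripe, two threads updating different keys usually take different locks, which is exactly the kind of short, fine-grained critical section the paper recommends.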

    Click to read more ...

    Sunday
    Jan252009

    Where do I start?

    Hello, I'm developing a product I thought of. As part of it, I'm trying to figure out the best architecture for the product. It's a server application that is supposed to serve A LOT of users at a time. Think of an IM application, but not web style, more like ICQ/MSN applications... with a web orientation. I've read all about the Meebo, Facebook Chat, etc. architectures. But I'm still not sure where to start on ICQ style (this is the first phase before I go totally web). Can you direct me to some information on load balancing 10 million users? ;-) I just don't know where to begin... Thanks, Amit

    Click to read more ...

    Thursday
    Jan222009

    Heterogeneous vs. Homogeneous System Architectures

    I follow a certain philosophy when developing system architectures. I assume that very few systems will ever exist in a consistent form for more than a short period of time. What constitutes a “short period of time” differs depending on the specifics of each system, but in an effort to quantify it, I generally find that it falls somewhere between a week and a month. The driving forces behind the need for an ever-changing architecture are largely based on business requirements. This is a side effect of the reality that software development, in most cases, plays a supporting role within the business unit it serves. As business requirements (i.e. additional features, new products, etc.) pour forth, it is the developer’s job to evolve their software system to accommodate these requirements and provide a software-based solution to whatever problems lie ahead. Given that many businesses can be identified as having the above characteristics, I can now begin to explain why I believe that Heterogeneous System Architectures hold a significant advantage over Homogeneous System Architectures in many distributed-system cases.

    Click to read more ...