Tuesday
Mar232010

Digg: 4000% Performance Increase by Sorting in PHP Rather than MySQL

O'Reilly Radar's James Turner conducted a very informative interview with Joe Stump, current CTO of SimpleGeo and former lead architect at Digg, in which Joe makes some of his usually insightful comments on his experience using Cassandra vs MySQL. As Digg started out with a MySQL oriented architecture and has recently been moving full speed to Cassandra, his observations on some of their lessons learned and the motivation for the move are especially valuable. Here are some of the key takeaways you find useful:

Click to read more ...

Monday
Mar222010

7 Secrets to Successfully Scaling with Scalr (on Amazon) by Sebastian Stadil

This is a part interview part guest with Sebastian Stadil, founder of Scalr, a cheaper open-source version of RightScale. Scalr takes care of all the web site infrastructure bits to on Amazon (and other clouds) so you don’t have to.

I first met Sebastian at one of the original Silicon Valley Cloud Computing Group meetups, a group which he founded. The meetings started in the tiny offices of Intalio where Sebastian was working with this new fangled Amazon thing to create an auto-scaling server farm on EC2. I remember distinctly how Sebastian met everyone at the door with a handshake and a smile, making us all feel welcome. Later I took one of the classes he created on how to use AWS. I guess he figured all this cloud stuff was going somewhere and decided to start Scalr.

My only regret about this post is that the name Amazon does not begin with the letter ‘S’, that would have made for an epic title.

Click to read more ...

Friday
Mar192010

Hot Scalability Links for March 19, 2010

  1. The Changelog Episode 0.1.8 - NoSQL Smackdown! This podcast was recorded at SXSW and features some energetic trash talking by: Stu Hood from Cassandra, Jan Lehnardt from CouchDB, Wynn Netherland from The Changelog, subbing for MongoDB, Werner Vogels CTO at Amazon. It's fun hearing these guys step out of their sober advocacy roles and let loose a little with why they are great and the other products suck, hard.
  2. Algorithmic Graph Theory . It's FREE! A GNU-FDL book on algorithmic graph theory by David Joyner, Minh Van Nguyen, and Nathann Cohen. This is an introductory book on algorithmic graph theory.
  3. HBase vs Cassandra: why we moved by Dominic Williams.
  4. Benchmarking Cloud Serving Systems with YCSB by lots of people from Yahoo! Research. We present the Yahoo! Cloud Serving Benchmark (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems. We define a core set of benchmarks and report re- sults for four widely used systems: Cassandra, HBase, Yahoo!’s PNUTS, and a simple sharded MySQL implementation.
  5. All recordings from NoSQL Live Boston now online!. It's almost like you were there.

Click to read more ...

Tuesday
Mar162010

1 Billion Reasons Why Adobe Chose HBase 

Cosmin Lehene wrote two excellent articles on Adobe's experiences with HBase: Why we’re using HBase: Part 1 and Why we’re using HBase: Part 2. Adobe needed a generic, real-time, structured data storage and processing system that could handle any data volume, with access times under 50ms, with no downtime and no data loss. The article goes into great detail about their experiences with HBase and their evaluation process, providing a "well reasoned impartial use case from a commercial user". It talks about failure handling, availability, write performance, read performance, random reads, sequential scans, and consistency. 

One of the knocks against HBase has been it's complexity, as it has many parts that need installation and configuration. All is not lost according to the Adobe team:

HBase is more complex than other systems (you need Hadoop, Zookeeper, cluster machines have multiple roles). We believe that for HBase, this is not accidental complexity and that the argument that “HBase is not a good choice because it is complex” is irrelevant. The advantages far outweigh the problems. Relying on decoupled components plays nice with the Unix philosophy: do one thing and do it well. Distributed storage is delegated to HDFS, so is distributed processing, cluster state goes to Zookeeper. All these systems are developed and tested separately, and are good at what they do. More than that, this allows you to scale your cluster on separate vectors. This is not optimal, but it allows for incremental investment in either spindles, CPU or RAM. You don’t have to add them all at the same time.

Highly recommended, especially if you need some sort of balance to the recent gush of Cassandra articles. 

Tuesday
Mar162010

Justin.tv's Live Video Broadcasting Architecture

The future is live. The future is real-time. The future is now. That's the hype anyway. And as it has a habit of doing, the hype is slowly becoming reality. We are seeing live searches, live tweets, live location, live reality augmentation, live crab (fresh and local), and live event publishing. One of the most challenging of all live technologies is that of live video broadcasting. Imagine a world in which everyone becomes a broadcaster and a consumer of video streams, all in real-time (< 250 msec latency), all so you can talk and interact directly without feeling like you are in the middle of a time shift war. The resources and the engineering needed to make this happened must be substantial. How do you do that?

To find out I talked to Kyle Vogt, Justin.tv Founder and VP of Engineering. Justin.tv certainly has the numbers. Their 30 million unique monthly visitors even outshine YouTube in the video upload game, reportedly uploading nearly 30 hours per minute of video compared to YouTube's 23. I asked for an interview after listening to an interview with Justin Kan, another Founder of the eponymously named Justin.tv. Justin talked about how live video was fundamentally different than YouTube's batch video approach, where all the video is stored on disk and replayed later on demand. Live video can't be made by pushing video faster, it takes a completely differently architecture. Since the YouTube Architecture article is the most popular article ever on this site, I thought people might also enjoy learning about live side of the video world. Kyle was unbelievably generous with his time and insight into how Justin.tv makes all this live video magic happen, going way beyond the call, providing a tremendous number of juicy details. Anyone building a system can learn something from how they run their business. I can't thank Kyle enough for putting up with my never ending prodding.

Click to read more ...

Thursday
Mar112010

What would you like to ask Justin.tv?

It looks like I'll have the chance to interview someone tomorrow from Justin.tv about their architecture, which is pretty exciting given their leadership role in live broadcasting. They get 30 million uniques a month, can handle 1 million simultaneous broadcasts and hope to grow another magnitude in the near future. That must take some doing.

Here's your opportunity, especially if you think my questions suck, to ask your own sucky questions :-) What would you like to know about Justin.tv?

Wednesday
Mar102010

Saying Yes to NoSQL; Going Steady with Cassandra at Digg

The last six months have been exciting for Digg's engineering team. We're working on a soup-to-nuts rewrite. Not only are we rewriting all our application code, but we're also rolling out a new client and server architecture. And if that doesn't sound like a big enough challenge, we're replacing most of our infrastructure components and moving away from LAMP.

Perhaps our most significant infrastructure change is abandoning MySQL in favor of a NoSQL alternative. To someone like me who's been building systems almost exclusively on relational databases for almost 20 years, this feels like a bold move.

What's Wrong with MySQL?

Our primary motivation for moving away from MySQL is the increasing difficulty of building a high performance, write intensive, application on a data set that is growing quickly, with no end in sight. This growth has forced us into horizontal and vertical partitioning strategies that have eliminated most of the value of a relational database, while still incurring all the overhead.

Relational database technology can be a blunt instrument and we're motivated to find a tool that matches our specific needs closely. Our domain area, news, doesn't exact strict consistency requirements, so (according to Brewer's theorem) relaxing this allows gains in availability and partition tolerance (i.e. operations completing, even in degraded system states). We're confident that our engineers can implement application level consistency controls much more efficiently than MySQL does generically.

As our system grows, it's important for us to span multiple data centers for redundancy and network performance and to add capacity or replace failed nodes with no downtime. We plan to continue using commodity hardware, and to continue assuming that it will fail regularly. All of this is increasingly difficult with MySQL.

 

Wednesday
Mar102010

How FarmVille Scales - The Follow-up

Several readers had follow-up questions in response to How FarmVille Scales to Harvest 75 Million Players a Month. Here are Luke's response to those questions (and a few of mine).

How does social networking makes things easier or harder?

The primary interesting aspect of social networking games is how you wind up with a graph of connected users who need to be access each other's data on a frequent basis. This makes the overall dataset difficult if not impossible to partition.

What are examples of the Facebook calls you try to avoid and how they impact game play?

We can make a call for facebook friend data to retrieve information about your friends playing the game. Normally, we show a friend ladder at the bottom of the game that shows friend information, including name and facebook photo. 

Can you say where your cache is, what form it takes, and how much cached there is? Do you have a peering relationship with Facebook, as one might expect at that bandwidth?

Click to read more ...

Tuesday
Mar092010

Sponsored Post: Job Openings - Squarespace

Squarespace Looking for Full-time Scaling Expert

Interested in helping a cutting-edge, high-growth startup scale? Squarespace, which was profiled here last year in Squarespace Architecture - A Grid Handles Hundreds of Millions of Requests a Month and also hosts this blog, is currently in the market for a crack scalability engineer to help build out its cloud infrastructure. Squarespace is very excited about finding a full-time scaling expert.

Interested applicants should go to http://www.squarespace.com/jobs-software-engineer for more information.



If you would like to advertise your critical, hard to fill job opeinings on HighScalability, please contact us and we'll get it setup for you.

Tuesday
Mar092010

Applications as Virtual States

This is an excerpt from my article Building Super Scalable Systems: Blade Runner Meets Autonomic Computing in the Ambient Cloud.

As I was writing an article on the architecture of the Storm Botnet, I couldn't help but notice the deep similarity of how Storm works and changes we're seeing in the evolution of political systems. In particular, the rise of the virtual-state. As crazy as this may sound, I think this is also the direction applications will need follow to survive in a complex world of billions of compute devices.

You may have already heard of virtual corporations. Virtual corporations are companies with limited office space, a distributed workforce, and production facilities located wherever it is profitable to locate them. The idea is to stay lean and compete using the rapid development and introduction of new products into high value-added markets. If you spot a market opportunity with a small time window, building your own factories and hiring and engineering team simply isn't an option. Building factories is a bit old fashioned and is left to the select few. These days you get an idea for a product and contract out everything else you possibly can. It doesn't really matter where you are located or where any of your partners are located. If part of your product requires a specialized microprocessor, for example, you'll contract out the R&D and the design. The manufacture will be contracted out to a virtual fab, then the chip will be sent to a contract manufacturing service for integration. Look ma, no hands.

Futurists say land doesn't matter anymore. Nations don't matter anymore. Entire relationships are abstractly represented by flows of money, contracts, information, and products between all these different agents. Interestingly enough, what technology is the absolute master of managing flows? Applications! But we are getting ahead of ourselves here.

Click to read more ...