Friday, June 28, 2013

Stuff The Internet Says On Scalability For June 28, 2013

Hey, it's HighScalability time:

  • Who am I? I have 50 petabytes of data stored in Hadoop and Teradata, 400 million items for sale, 250 million queries a day, 100,000 pages served per second, 112 million active users, $75 billion sold in 2012...If you guessed eBay then you've won the auction.
  • Quotable Quotes:
    • Controlled Experiments at Large Scale: Bing found that every 100ms faster they deliver search result pages yields 0.6% more in revenue
    • Luis Bettencourt: A city is first and foremost a social reactor. It works like a star, attracting people and accelerating social interaction and social outputs in a way that is analogous to how stars compress matter and burn brighter and faster the bigger they are.
    • @nntaleb: unless you understand that fat tails come from concentration of errors, you should not discuss probability & risk 
  • Need to make Hadoop faster? Hadoop + GPU: Boost performance of your big data project by 50x-200x? Or there's Spark, which uses in-memory techniques to run up to 100x faster than Hadoop. Also, Spark: Open Source Superstar Rewrites Future of Big Data
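
If the Spark speedup sounds abstract, here's a minimal PySpark sketch, assuming a local Spark install and a hypothetical log file path, of the in-memory caching that avoids re-reading data from disk between passes:

```python
# Hedged sketch of Spark's in-memory reuse: the input path and job name
# are hypothetical; the point is that .cache() keeps the RDD in RAM so
# the second pass doesn't hit storage again.
from pyspark import SparkContext

sc = SparkContext("local[*]", "cache-sketch")

words = (sc.textFile("hdfs:///logs/access.log")      # hypothetical input
           .flatMap(lambda line: line.split())
           .cache())                                  # pin in memory

print(words.count())                                  # first pass reads HDFS
print(words.filter(lambda w: w == "ERROR").count())   # second pass reuses RAM
```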

  • Human, you aren't so special after all. Even plants can do math: During the night, mechanisms inside the leaf measure the size of the starch store and estimate the length of time until dawn. Information about time comes from an internal clock, similar to our own body clock. The size of the starch store is then divided by the length of time until dawn to set the correct rate of starch consumption, so that, by dawn, around 95% of starch is used up.
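
The division itself is simple enough to sketch; the numbers below are made up, only the rationing rule comes from the study:

```python
# Toy arithmetic sketch of the plant's rule: consume the starch store at
# (0.95 * store) / hours_until_dawn so roughly 95% is gone by sunrise.
# Values are illustrative, not measured.
starch_store = 100.0      # arbitrary units at nightfall
hours_until_dawn = 10.0   # from the internal clock

rate = 0.95 * starch_store / hours_until_dawn
print(f"consume {rate:.1f} units/hour; "
      f"{starch_store - rate * hours_until_dawn:.0f} units left at dawn")
```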

  • Filmgrain talks about their architecture. All Redis all the time. Redis does quad duty as a cache, queue, pub/sub, and database. Each component is isolated through Redis.
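
A minimal sketch of that Redis-everywhere pattern, using the redis-py client with default localhost settings; the key names are hypothetical:

```python
# One Redis, four roles: cache, queue, pub/sub, and primary store.
import redis

r = redis.Redis(host="localhost", port=6379)

# Cache: an expiring key
r.setex("cache:feed:123", 60, "rendered-json")

# Queue: producer pushes here, a worker elsewhere does BRPOP
r.lpush("queue:thumbnails", "movie:42")

# Pub/sub: notify any listening components
r.publish("events", "movie:42:liked")

# Database: a hash record (durable only with RDB/AOF persistence enabled)
r.hset("movie:42", "title", "Primer")
```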

  • When key-value and relational are not enough, Facebook went graph with TAO, a custom distributed service designed around objects and associations, used for many features including likes, pages, and events. It serves thousands of data types and handles over a billion read requests and millions of write requests every second. Most of the complexity is kept in the client; the server doesn't perform typical graph algorithm operations. This simplicity "helps product engineers find an optimal division of labor between application servers, data store servers, and the network connecting them." It's geographically distributed, eventually consistent, organized as a tree, with a single master per shard.
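
This isn't Facebook's API, just a toy sketch of the objects-and-associations data model the TAO paper describes, including the newest-first association range query that dominates its read traffic:

```python
# Hypothetical in-memory model: typed objects plus typed, directed
# associations (e.g. "user LIKES page"), with graph logic kept client-side.
import time

objects = {}   # id -> (otype, data)
assocs = {}    # (id1, atype) -> list of (id2, created_at)

def object_add(oid, otype, **data):
    objects[oid] = (otype, data)

def assoc_add(id1, atype, id2):
    assocs.setdefault((id1, atype), []).append((id2, time.time()))

def assoc_range(id1, atype, limit=10):
    # Newest-first range query, the workhorse read in TAO
    return sorted(assocs.get((id1, atype), []), key=lambda a: -a[1])[:limit]

object_add("u:1", "user", name="Alice")
object_add("p:9", "page", title="Slime Mould Fan Club")
assoc_add("u:1", "LIKES", "p:9")
print(assoc_range("u:1", "LIKES"))
```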

  • Will the amazing capabilities of the seemingly simple slime mould never end? Now it could make memristors for biocomputers:  Slime mould can be used to perform all the logic functions that conventional computer hardware components can do.

  • In the concurrent connection game, MigratoryData says they can scale up to 12 million concurrent users on a single Dell PowerEdge R610 server.
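
As a rough sketch of that workload shape (many mostly idle connections, one small push per minute each), here's a toy asyncio server; the port and payload are placeholders, and a real deployment at this scale also needs heavy kernel and fd-limit tuning:

```python
# Toy sketch only: hold idle TCP connections and push a 512-byte message
# to each one every minute.
import asyncio

clients = set()

async def handle(reader, writer):
    clients.add(writer)
    try:
        await reader.read()          # returns when the client disconnects
    finally:
        clients.discard(writer)
        writer.close()

async def push_loop():
    while True:
        await asyncio.sleep(60)
        for w in list(clients):
            w.write(b"x" * 512)      # a real server would await w.drain()

async def main():
    server = await asyncio.start_server(handle, "0.0.0.0", 9000)
    asyncio.create_task(push_loop())
    async with server:
        await server.serve_forever()

asyncio.run(main())
```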

  • To make an omelette you have to break a few eggs. Same for software? The Antifragile Organization is an ACM article about how Netflix is "embracing failure to improve resilience and maximize availability." Yes, you'll find your favorite Planet of the Apes characters, but there's also the idea of an antifragile organization: every engineer is an operator; each failure is an opportunity to learn; don't point the fickle finger of blame.

  • Azure now supports autoscaling. The number of instances is settable via the console as are threshold CPU load levels. Now all you need to do is make your app horizontally scalable.
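
The rule behind threshold-based autoscaling is easy to sketch. This isn't the Azure API, just a hypothetical scale-out/scale-in decision function of the kind those console settings configure:

```python
# Hypothetical threshold autoscaler: add an instance when average CPU
# stays above the high-water mark, remove one when it falls below the
# low-water mark, and do nothing inside the band.
def desired_instances(current, avg_cpu, low=40, high=70, min_n=2, max_n=20):
    if avg_cpu > high:
        return min(current + 1, max_n)   # scale out
    if avg_cpu < low:
        return max(current - 1, min_n)   # scale in
    return current                        # hold steady

print(desired_instances(current=4, avg_cpu=85))  # -> 5
```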

  • Disks ain't dead yet. StorageMojo: While someone, somewhere, will undoubtedly invest in an all-flash data center, very few businesses will go in that direction in the next 10 years. New storage systems that stress commodity hardware and scale out architectures can be looked at as horizontal layers rather than vertical stovepipes.

  • Slides and Videos for RICON East are now available. A lot of good stuff. I predict you might like Automatically Scalable Computation by Dr. Seltzer. Using prediction in programs to make the best use of a massively multicore future. Combines parallelization with machine learning pixie dust. Also, Optimizing LevelDB for Performance and Scale

  • It doesn't matter what your automation tool of choice is, Salt: Like Puppet, Except It Doesn’t Suck is a great discussion of the different angles developers take on the problem. No real winner. Just lots of good options. Given we had to build all this stuff from scratch not that long ago, that's a good thing.

  • Chaos is always just one bug away. A bug on the iPhone caused open TCP connections not to close. Very easy to do. Something probably every network programmer has done at one time. The effect when the bug is on zillions of phones? A DDoS attack on YouTube
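
The bug class is easy to reproduce in any language. Here's a hypothetical Python illustration: one return path forgets to close the socket, and the fix is a context manager that closes it on every path:

```python
# Hypothetical example of a socket leak; names and host are placeholders.
import socket

def fetch_leaky(host):
    s = socket.create_connection((host, 80))
    s.sendall(b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
    data = s.recv(4096)
    if not data:
        return None        # bug: socket never closed on this path
    s.close()
    return data

def fetch_safe(host):
    # The context manager closes the socket on every path, including errors.
    with socket.create_connection((host, 80)) as s:
        s.sendall(b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
        return s.recv(4096)
```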

  • Tomek Wójcik tackles tag design with Fun with PostgreSQL: Tagging blog posts. Arrays make so many things easier.
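
A minimal sketch of the array approach, using psycopg2 with a hypothetical posts table and DSN: psycopg2 adapts Python lists to PostgreSQL arrays, and a GIN index serves the "contains this tag" query:

```python
# Hypothetical schema and DSN; tags live in a text[] column.
import psycopg2

conn = psycopg2.connect("dbname=blog")
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS posts (
        id    serial PRIMARY KEY,
        title text NOT NULL,
        tags  text[] NOT NULL DEFAULT '{}'
    );
    CREATE INDEX IF NOT EXISTS posts_tags_idx ON posts USING gin (tags);
""")

cur.execute("INSERT INTO posts (title, tags) VALUES (%s, %s)",
            ("Fun with arrays", ["postgresql", "tagging"]))

# All posts carrying the 'postgresql' tag
cur.execute("SELECT title FROM posts WHERE tags @> %s", (["postgresql"],))
print(cur.fetchall())
conn.commit()
```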

  • If you've ever wondered how Google machine learns on pictures then you might like Fast, Accurate Detection of 100,000 Object Classes on a Single Machine: We demonstrate the advantages of our approach by scaling object detection from the current state of the art involving several hundred or at most a few thousand of object categories to 100,000 categories requiring what would amount to more than a million convolutions. Moreover, our demonstration was carried out on a single commodity computer requiring only a few seconds for each image. The basic technology is used in several pieces of Google infrastructure and can be applied to problems outside of computer vision such as auditory signal processing. 

  • Beyond Silicon: Transistors without Semiconductors: Imagine that the nanotubes are a river, with an electrode on each bank. Now imagine some very tiny stepping stones across the river. The electrons hopped between the gold stepping stones. The stones are so small, you can only get one electron on the stone at a time. Every electron is passing the same way, so the device is always stable.

  • If Mobile is the Future then we should probably learn how to design for mobile. eBay found 90% of customer support calls happened because there was no "forgot password" link in their login flow. In 2011, three quarters of shopping carts were abandoned. Expedia removed the Company field from a checkout form and the result was $12 million more in sales overnight. People who use Amazon Prime go from $400 to $900 a year in purchases because it's so easy. Mobile is a magnifying lens for UI problems.

  • Bare-Metal Multicore Performance in a General-Purpose Operating System. James Aguilar with a good TLDR: we know whether we have any work to do in the kernel, and when the next work is due. If there is no work to do now, and no known work to do in the future, there will never be any work to do unless it is scheduled via some interrupt (explicit timer, keyboard input, the NIC, disk, etc.). In the meantime, you get tickless behavior.

  • Greg Linden cooked up a new batch of tasty Quick Links

Reader Comments (3)

Hi Todd,

The headline grabbing 12 million concurrent connections is based on a simulation, not 12 million actual concurrent connections. Actually connecting anywhere near 1 million, let alone 12 million, concurrent connections to a commodity Dell server is harder than it looks. Plus, at a rate of 1 message per minute per client, is this really the kind of scalability most use cases are looking for? To put this in perspective, at 25 thousand concurrent connections this system may be processing 25 million messages a minute. Unless these kinds of claims can actually be verified, they shouldn't be quoted, especially when this has been very much a solved problem:

In 2008 Richard Jones broke the 1 million connection/user barrier with erlang: http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3

Sometimes 'stuff the internet says' is just stuff with little or no value. Otherwise a nice set of quotes there.

Cheers,

Darach.

July 9, 2013 | Unregistered CommenterDarach

Darach,

I'm afraid I totally disagree with your comment. According to your profile, you work for Push Technology, a competitor of MigratoryData, for which I work. Of course, I'm open to criticism from competitors, but let's go through your arguments point by point:

> The headline grabbing 12 million concurrent connections is based on a simulation, not 12 million actual concurrent connections.

There are 12 million actual concurrent TCP connections. If you do a netstat on the Dell R610 machine, you see 12 million established socket connections. Are these actual connections or simulated ones? I can attach screenshots with netstat or JMX monitoring stats if you prefer.

> Actually connecting anywhere near 1 million let alone 12 million concurrent connections to a commodity dell server is harder than it looks.

Can you substantiate this? Because you cite Richard Jones's 1M concurrent connections result, perhaps you think 1M concurrent connections is very difficult to achieve and 12M therefore seems unrealistic. Please note that we achieved 1M concurrent connections 3 years ago, as mentioned in my post, and now we've achieved 12M concurrent connections on a Dell R610.

> Plus at a rate of 1 message per minute per client, is this really the kind of scalability most use cases are looking for?

I mentioned in my post that such a use case can be useful for push notifications. A push notification is like an SMS. I think a frequency of one push notification per minute to each user is more than reasonable.

> To put this in perspective, at 25 thousand concurrent connections this system may be processing 25 million messages a minute.

For the 12M concurrent users scenario where each user receives a message every minute, the message throughput is 200,000 messages per second or 12 million messages per minute. The payload of each message is 512 bytes.

I think you meant to say "25 million concurrent connections" instead of "25 thousand concurrent connections", and yes, the system would handle 25 million messages a minute. 25M messages per minute means about 400,000 messages per second. This is not an exceptionally high message throughput. In the finance industry there are systems which deliver millions of messages per second.

For example, in another benchmark we achieved message publication at nearly 2 million messages per second (or, if you prefer, 120 million messages per minute), almost saturating the 10 Gbps Ethernet:

http://mrotaru.wordpress.com/2013/03/27/migratorydata-demonstrates-record-breaking-8x-higher-websocket-scalability-than-competition/

So, what's the point of your observation?

> Unless these kinds of claims can actually be verified then they shouldn't be quoted

The Benchmark Kit of MigratoryData WebSocket Server is available for anyone who wants to repeat this 12M concurrent users result. Just contact us.

Thanks,
Mihai

July 18, 2013 | Unregistered CommenterMihai Rotaru

The new blog post Scaling to 12 Million Concurrent Connections: How MigratoryData Did It offers insight and lessons learned while pushing the boundaries of scalability with MigratoryData WebSocket Server.

October 11, 2013 | Unregistered CommenterMihai Rotaru
