Friday
May 17, 2013

Stuff The Internet Says On Scalability For May 17, 2013

Hey, it's HighScalability time:

 

  • Google I/O to world: Just try to keep up with us. You can't. But go ahead and try. Nah na na na nah...

  • 17 billion: Google Cloud Messaging messages per day with 60ms latency; 1B page views: 500px; 121 billion: edge graph using Titan; 4 billion hours: hours watched on Netflix per quarter; 4.5 trillion: BigTable transactions per month

  • Quotable Quotes:
    • to3m: As with any time you make plans for the future, sometimes you get it wrong. Ars longa vita brevis, and all that.
    • Callaghan’s law: a given row can’t be modified more than once per RTT
    • Josh Haberman: I had an epiphany one day when I realized that the kernel is nothing but a library with an expensive calling convention.
    • fread2281: Insane speed calls for insane measures.
    • Luke Gorrie: hardware really wants to run fast and you only need to avoid getting in the way --  not too hard if you write the whole stack to match your application, but very hard if you depend on abstractions and misunderstand what's going on.
    • Francis Stephens: This exposes an important, and to me non-obvious, property of concurrency. That it's not the locking that's really hard, it's how to be sure that every piece of related data is included in the lock (or STM).
    • @jamesurquhart: "Complexity is a characteristic of the system, not of the parts in it." -Dekker
    • Colin Scott: out of all the datacenter links types, the average downtime was 0.3 days. This translates to roughly three and a half 9’s of reliability, an order of magnitude greater than WAN links.
    • @adocortes: GPU vs CPU 40x faster for image processing in clusters

  • Really fast growth really does happen says someone somewhere: Dots game from Betaworks hits 100 million game plays in first 2 weeks

  • If you love something you should set it free or lose everything. Fred Wilson observes: This is a classic case of the innovator's dilemma. RIM felt that letting BBM out in the open would make it easier for Blackberry users to leave. So they kept it proprietary. For way too long. Now they no longer have a dominant smartphone franchise or a dominant mobile messenger franchise.

  • When Big Data ecosystems start merging it's not the end of the world, but building a different world: Amex to tap big data (TripAdvisor) to expose fake reviews.
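Callaghan's law from the quotes above lends itself to a quick back-of-the-envelope check. Here is a minimal sketch, assuming each update to a row must wait one full round trip before the next can commit; the function name and the sample RTT values are illustrative and not taken from any of the linked sources:

```python
def max_row_updates_per_second(rtt_ms: float) -> float:
    """Upper bound on updates to a single row if each update costs one round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return 1000.0 / rtt_ms


if __name__ == "__main__":
    # Hypothetical round-trip times: same rack, same datacenter, cross-region.
    for rtt_ms in (0.5, 1.0, 100.0):
        limit = max_row_updates_per_second(rtt_ms)
        print(f"RTT {rtt_ms} ms: at most {limit:.0f} updates/s per row")
```

Under that assumption, a hot row behind a 100 ms cross-region link tops out near 10 updates per second, however fast the servers are.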

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...


Thursday
May 16, 2013

Paper: Warp: Multi-Key Transactions for Key-Value Stores

Looks like an interesting take on "a completely asynchronous, low-latency transaction management protocol, in line with the fully distributed NoSQL architecture."

Warp: Multi-Key Transactions for Key-Value Stores overview:

Implementing ACID transactions has been a longstanding challenge for NoSQL systems. Because these systems are based on a sharded architecture, transactions necessarily require coordination across multiple servers. Past work in this space has relied either on heavyweight protocols such as Paxos or clock synchronization for this coordination.

This paper presents a novel protocol for coordinating distributed transactions with ACID semantics on top of a sharded data store. Called linear transactions, this protocol achieves scalability by distributing the coordination task to only those servers that hold relevant data for each transaction. It achieves high performance by serializing only those transactions whose concurrent execution could potentially yield a violation of ACID semantics. Finally, it naturally integrates chain-replication and can thus tolerate faults of both clients and servers. We have fully implemented linear transactions in a commercially available data store. Experiments show that the throughput of this system achieves 1-9× more throughput than MongoDB, Cassandra and HyperDex on the Yahoo! Cloud Serving Benchmark, even though none of the latter systems provide transactional guarantees.

Wednesday
May 15, 2013

Lesson from Airbnb: Give Yourself Permission to Experiment with Non-scalable Changes

If you are stuck drowning in too much data and too many options and are dazzled by all the possibilities of code, here's a helpful bit of advice from Airbnb's rags-to-riches origin story: it's okay to do things that don’t scale.

A corollary is to pay attention to and learn from what your users are actually doing, and to let that lead you without that annoying voice in your head second-guessing you, yelling "but that will never scale!" Worry about building something good, then worry about making it scale.

In Airbnb's case they noticed people weren't booking rooms because the pictures sucked. So they flew to New York and shot some beautiful images. This is a very non-scalable and non-technical solution. Yet it was the turning point for Airbnb and sparked their climb out of the "trough of sorrow." Previously they had been limited by the Silicon Valley idea that every feature had to be scalable. Not every solution can be found behind a computer screen.

For the full story please read How design thinking transformed Airbnb from a failing startup to a billion dollar business.


Tuesday
May 14, 2013

Sponsored Post: Dow Jones, Spotify, Evernote, Surge, Rackspace, Amazon, Booking, aiCache, Aerospike, Percona, ScaleOut, New Relic, LogicMonitor, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Amazing things are happening at Dow Jones – help build the next generation News and Media platforms that serve the best journalism in the world. High-impact, passionate, and driven technologists thrive in our environment, building platforms that deliver trusted content that enlightens and inspires millions around the world.  Please apply online
  • Want to build scalable systems that power the world's largest music streaming service? Spotify is looking for engineers for our backend infrastructure team. Apply now.
  • At Evernote our vision is to help the world remember everything. If you want to work in a fast-paced, highly rewarding environment with some of the smartest engineers on the planet, then come join us! We are looking for Sr. Security Engineers and Sr. Operations Engineers/DevOps to join our operations team.
  • LogicMonitor is looking for a Front End developer to have a huge impact, be valued, realize their dreams, and help us realize ours. We are looking for someone to own the code that delivers the design and usability of LogicMonitor's enterprise SaaS application(s). Please apply online
  • We need awesome people @ Booking.com - We want YOU! Come design next generation interfaces, solve critical scalability problems, and hack on one of the largest Perl codebases. Please apply online.
  • The AWS Relational Database Service (RDS) automates management of relational databases in the cloud. We have a wide variety of customers and are part of many mission-critical applications, like the ones built by the 2012 Obama re-election campaign. If you're interested in joining a fast-growing service and team, please send your resume to rds-jobs@amazon.com.
  • New Relic is looking for a Java Scalability Engineer in Portland, OR. Ready to scale a web service with more incoming bits/second than Twitter?  http://newrelic.com/about/jobs

Fun and Informative Events

  • Surge - The Scalability & Performance Conference, presented by OmniTI is happening on Sept. 12th-13th. Special, High Scalability Reader Rate: $50 off registration--now through September 10!
  • It's back! Join the MySQL Community at the annual Percona Live MySQL Conference and Expo in Santa Clara, April 22-25. This year's conference features an outstanding lineup of 92 speakers delivering 112 breakout sessions over three days! 

Cool Products and Services

If any of these items interest you there's a full description of each sponsor below. Please click to read more...


Monday
May 13, 2013

The Secret to 10 Million Concurrent Connections - The Kernel is the Problem, Not the Solution

Now that we have the C10K concurrent connection problem licked, how do we level up and support 10 million concurrent connections? Impossible, you say? Nope, systems right now are delivering 10 million concurrent connections using techniques that are as radical as they may be unfamiliar.

To learn how it’s done we turn to Robert Graham, CEO of Errata Security, and his absolutely fantastic talk at Shmoocon 2013 called C10M Defending The Internet At Scale.

Robert has a brilliant way of framing the problem that I’ve never heard before. He starts with a little bit of history, relating how Unix wasn’t originally designed to be a general server OS; it was designed to be a control system for a telephone network. It was the telephone network that actually transported the data, so there was a clean separation between the control plane and the data plane. The problem is we now use Unix servers as part of the data plane, which we shouldn’t do at all. If we were designing a kernel for handling one application per server, we would design it very differently from a multi-user kernel.

Which is why he says the key is to understand:

  • The kernel isn’t the solution. The kernel is the problem.

Which means:

  • Don’t let the kernel do all the heavy lifting. Take packet handling, memory management, and processor scheduling out of the kernel and put them into the application, where they can be done efficiently. Let Linux handle the control plane and let the application handle the data plane.

The result will be a system that can handle 10 million concurrent connections with 200 clock cycles for packet handling and 1,400 clock cycles for application logic. As a main memory access costs 300 clock cycles, it’s key to design in a way that minimizes code and cache misses.

With a data plane oriented system you can process 10 million packets per second. With a control plane oriented system you only get 1 million packets per second.
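The cycle figures above can be turned into a rough per-packet budget. This is a minimal sketch that uses the 200, 1,400, and 300 cycle figures from the text; the 2 GHz clock rate and the function names are assumptions made for illustration and do not come from the talk:

```python
PACKET_HANDLING_CYCLES = 200   # from the text: cycles for packet handling
APP_LOGIC_CYCLES = 1400        # from the text: cycles for application logic
MEMORY_ACCESS_CYCLES = 300     # from the text: cost of one main memory access


def per_packet_budget() -> int:
    """Total cycles available to process one packet."""
    return PACKET_HANDLING_CYCLES + APP_LOGIC_CYCLES


def memory_accesses_that_fit(budget_cycles: int) -> int:
    """How many uncached main memory accesses fit in a cycle budget."""
    return budget_cycles // MEMORY_ACCESS_CYCLES


def cores_needed(packets_per_sec: float, clock_hz: float) -> float:
    """Cores required to sustain a packet rate at the full per-packet budget."""
    return packets_per_sec * per_packet_budget() / clock_hz


if __name__ == "__main__":
    budget = per_packet_budget()
    print("cycles per packet:", budget)
    print("memory accesses in full budget:", memory_accesses_that_fit(budget))
    print("memory accesses in packet handling alone:",
          memory_accesses_that_fit(PACKET_HANDLING_CYCLES))
    # Assumed 2 GHz cores; the clock rate is hypothetical.
    print("cores for 10M packets/s:", cores_needed(10_000_000, 2e9))
```

Under these assumptions only about five uncached memory accesses fit in a packet's whole budget, and not even one fits in the packet handling portion, which is why avoiding cache misses matters so much.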

If this seems extreme keep in mind the old saying: scalability is specialization. To do something great you can’t outsource performance to the OS. You have to do it yourself.

Now, let’s learn how Robert creates a system capable of handling 10 million concurrent connections...


Friday
May 10, 2013

Stuff The Internet Says On Scalability For May 10, 2013

Hey, it's HighScalability time:

 

  • Nanoscale: Plants IM Using Nanoscale Sound Waves; 100 petabytes: CERN data storage
  • Quotable Quotes:
    • Geoff Arnold: Arguably all interesting advances in computer science and software engineering occur when a resource that was previously scarce or expensive becomes cheap and plentiful.
    • @jamesurquhart: "Complexity is a characteristic of the system, not of the parts in it." -Dekker
    • @louisnorthmore: Scaling down - now that's scalability!
    • @peakscale: Where distributed systems people retire to forget the madness: http://en.wikipedia.org/wiki/Antipaxos 
    • @dozba: "The Linux Game Database" ... Well, at least they will never have scaling problems.
    • Michael Widenius: There is no reason at all to use MySQL
    • @steveloughran: Whenever someone says "unlimited scalability", ask if that exceeds the berkenstein bound
    • @nationofminds: "I have infinite MIPS. Unlimited scalability. And zero effing patience." 
    • Endowing cells with logic and memory: Genetic circuits that process and permanently store information are created with recombinases that flip the orientation of DNA cassettes.

  • Search Is Eating The World. The long sought after Nirvana of search and database becoming one may be nigh. 

  • And you thought scalability didn't pay: Twitter Acquires Palo Alto-Based Scalable Computing Startup Ubalo

  • New Finds: @foodfight is an interesting and informative Chef-oriented DevOps podcast you may enjoy if that's the sort of thing you enjoy, which you probably do. Through it I learned, from fellow Way of Kings aficionado Brandon Burton, about a new deep systems podcast called Real Talk by James Golick and Joe Damato, who want to talk about things concrete, not like that Hacker News BS.

  • I'd love to see the API: The idea we live in a simulation isn't science fiction. Magic anyone?

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...


Wednesday
May 8, 2013

Typesafe Interview: Scala + Akka is an IaaS for Your Process Architecture

This is an email interview with Viktor Klang, Director of Engineering at Typesafe, on the Scala Futures model and Akka, both topics on which he is immensely passionate and knowledgeable.

How do you structure your application? That’s the question I explored in the article Beyond Threads And Callbacks. An option I did not talk about, mostly because of my own ignorance, is a powerful stack you may not be all that familiar with: Scala and Akka.

To remedy my oversight is our acting tour guide, Typesafe’s Viktor Klang, long time Scala hacker and Java enterprise systems architect. Viktor was very patient in answering my questions and was enthusiastic about sharing his knowledge. He’s a guy who definitely knows what he is talking about.

I’ve implemented several Actor systems along with the messaging infrastructure, threading, async IO, service orchestration, failover, etc, so I’m innately skeptical about frameworks that remove control from the programmer at the cost of latency.

So at the end of the interview am I ready to drink the koolaid? Not quite, but I’ll have a cup of coffee with the idea. 

I came to think of Scala + Akka as a kind of IaaS for your process architecture. Toss in Play for the web framework and you have a slick stack, with far more out of the box power than Go, Node, or plaino jaino Java.

The build or buy decision is surprisingly similar to every other infrastructure decision you make. Should you use a cloud or build your own? It’s the same sort of calculation you need to go through when deciding on your process architecture. At the extremes you lose functionality and flexibility, but since they’ve already thought of most everything you would need to think about, with examples and support, you gain a tremendous amount too. Traditionally, however, process architecture has been entirely ad hoc. That may be changing.

Now, let’s start the interview with Viktor...


Tuesday
May 7, 2013

Not Invented Here: A Comical Series on Scalability 

I read one of these poignantly humorous comics on Not Invented Here a while back and since I wasn't sure it was OK to repost I emailed asking for permission. Nada. Then I saw Martijn de Vrieze posted a collection of scalability comics from NIH and decided what the heck (click image to read on site):

Thanks to Martijn for curating the collection and NIH for creating them.

And I agree with Martijn, they do capture an ineffable quality about the entire space.

Monday
May 6, 2013

7 Not So Sexy Tips for Saving Money On Amazon

Harish Ganesan, CTO of 8KMiles, has a very helpful blog, Cloud, Big Data and Mobile, where he shows a nice analytical bent that leads to a lot of practical advice and cost saving tips:
  1. Use SQS Batch Requests to reduce the number of requests hitting SQS, which saves costs. Sending 10 messages in a single batch request saves $30/month in the example.
  2. Use SQS Long Polling to reduce extra polling requests, cutting down empty receives, which in the example saves ~$600 in empty receive leakage costs.
  3. Choose the right search technology to save costs in AWS by matching your activity pattern to the technology. For a small application with constant load, a heavily utilized search tier, or seasonal loads, Amazon Cloud Search looks like the cost efficient play.
  4. Use Amazon CloudFront Price Class to minimize costs by selecting the right Price Class for your audience to potentially reduce delivery costs by excluding Amazon CloudFront’s more expensive edge locations.
  5. Optimize ElastiCache Cluster costs by right sizing cluster node sizes. For different usage scenarios (heavy, moderate, low) there are optimal instance types. Choosing the right type for the right usage scenario saves money.
  6. Amazon Auto Scaling can save costs by better matching demand and capacity. Certainly not a new idea but the diagrams, different leakage scenarios (daily spike, weekly fluctuation, seasonal spike), and the explanation of potential savings (substantial) are well done.
  7. Use Amazon S3 Object Expiration feature to delete old backups, logs, documents, digital media, etc. A leakage of ~20 TB adds up to a tidy ~1650 USD a year. 
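Tips 1 and 2 can be sketched against boto3's SQS client. This is a minimal sketch, assuming an already configured client is passed in (for example one created with `boto3.client("sqs")`); the helper names and the queue URL are hypothetical. `send_message_batch` and `receive_message` with `WaitTimeSeconds` are the client calls being illustrated, and the 10 message batch size and 20 second wait are the SQS maximums:

```python
from typing import Any, Dict, Iterator, List

SQS_MAX_BATCH = 10         # SQS accepts at most 10 messages per batch request
SQS_MAX_WAIT_SECONDS = 20  # longest long-poll wait SQS allows


def batch_entries(bodies: List[str],
                  size: int = SQS_MAX_BATCH) -> Iterator[List[Dict[str, str]]]:
    """Yield SendMessageBatch entry lists of at most `size` messages each."""
    for start in range(0, len(bodies), size):
        chunk = bodies[start:start + size]
        # Ids only need to be unique within a single batch request.
        yield [{"Id": str(i), "MessageBody": body} for i, body in enumerate(chunk)]


def send_in_batches(sqs_client: Any, queue_url: str, bodies: List[str]) -> int:
    """Send all bodies using batch requests; return the number of requests made."""
    requests_made = 0
    for entries in batch_entries(bodies):
        response = sqs_client.send_message_batch(QueueUrl=queue_url, Entries=entries)
        requests_made += 1
        # A real caller would retry the entries listed under "Failed".
        if response.get("Failed"):
            raise RuntimeError(f"{len(response['Failed'])} messages failed to send")
    return requests_made


def receive_with_long_poll(sqs_client: Any, queue_url: str) -> List[Dict[str, Any]]:
    """Wait up to 20 seconds for messages instead of returning empty right away."""
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=SQS_MAX_BATCH,
        WaitTimeSeconds=SQS_MAX_WAIT_SECONDS,
    )
    return response.get("Messages", [])
```

With 25 queued messages, `send_in_batches` makes 3 requests instead of 25, and a long-polling consumer issues at most one receive per 20 seconds on an idle queue rather than a tight loop of empty receives.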
Friday
May 3, 2013

Stuff The Internet Says On Scalability For May 3, 2013

Hey, it's HighScalability time:

 

  • 1,966,080 cores: Time Warp synchronization protocol using up to 7.8M MPI tasks on 1,966,080 cores of the Sequoia Blue Gene/Q supercomputer system. 33 trillion events processed in 65 seconds yielding a peak event-rate in excess of 504 billion events/second using 120 racks of Sequoia.
  • Quotable Quotes:
    • Thad Starner: the longer accessing a device exceeds 2s, the more its actual usage would decrease exponentially. Thus, he made a claim that a wrist watch interface always sitting on one's wrist ready to use should be more successful than mobile phones, which have to be pulled out of the pocket.
    • @joedevon: We came for scalability but we stayed for agility #NoSQL
    • @jahmailay: "Our user base is exploding. I really wish we spent more time on scalability instead of features customers don't use." - Everybody, always.
    • @bsletten: I don’t think it is a coincidence that the words eval() and evil are so close.
    • @RCSecure: Maybe Gov should stop deploying crappy #CyberSecurity instead of Surveiling Citizens
    • @davidpav: "This is what Netflix does - after each deployment creates AMI for faster scaling up"
    • @franzgranlund: Rewrote my little batch-processing application using #akka . 20% performance increase just like that - and now it is easier to scale.
    • @marshray: Ouch, that's kind of dismal. Perhaps we need a new term: "eventual scalability"
    • @adrianco: RT @rbranson: @cscotta load average is the worst thing ever. Slowly trying to evangelize it's demise as a reasonable metric. < +1 every 15 m

  • MIT Tech Review picks 10 breakthrough technologies: Smart Watches (really?), Memory implants (deciphering the code by which the brain forms long-term memories), Additive manufacturing (3-D printing), Supergrids (finally says Edison, DC powergrids), Temporary social media (sigh), Prenatal DNA sequencing (great for full lifecycle ad targeting), Baxter (compliant robots), Deep Learning (the singularity is near), Ultra-Efficient Solar Power (now we are talking). Prediction: We'll laugh at all this filter control talk once we have all of Google's datacenters and knowledge graph software implanted in our heads.

  • IBM on making movies using atoms as pixels. Characterization was a little thin but the plot was magnetic.

  • Lesson from Airbnb: Give yourself permission to experiment with non-scalable changes. Building better is better than building bigger.

  • Here's a short review by me on CyberStorm by Matthew Mather. Matthew is also the author of the most excellent Atopia Chronicles, a sprawling exploration of "artificial intelligence, distributed computing, nanotechnology, and the full range of humanity." CyberStorm is a chilling blow by blow of what could happen in a real cyber attack. As a programmer it's the implied idea of a kind of Crises OS built on a mesh of smartphones that I found most fascinating. Not much seems to be done in this area and even the how-to of writing such applications is rarely discussed. Could be interesting.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...
