Entries by HighScalability Team (1576)

Wednesday
Aug 28, 2013

Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability

This article is a lightly edited version of 20 Obstacles to Scalability by Sean Hull (with permission) from the always excellent and thought provoking ACM Queue.

1. TWO-PHASE COMMIT

Normally when data is changed in a database, it is written both to memory and to disk. When a commit happens, a relational database makes a commitment to freeze the data somewhere on real storage media. Remember, memory doesn't survive a crash or reboot. Even if the data is cached in memory, the database still has to write it to disk. MySQL binary logs or Oracle redo logs fit the bill.

With a MySQL cluster or distributed file system such as DRBD (Distributed Replicated Block Device) or Amazon Multi-AZ (Multi-Availability Zone), a commit occurs not only locally, but also at the remote end. A two-phase commit means waiting for an acknowledgment from the far end. Because of network and other latency, those commits can be slowed down by milliseconds, as though all the cars on a highway were slowed down by heavy loads. For those considering using Multi-AZ or read replicas, the Amazon RDS (Relational Database Service) use-case comparison at http://www.iheavy.com/2012/06/14/rds-or-mysql-ten-use-cases/ will be helpful.

Synchronous replication has these issues as well; hence, MySQL's solution is semi-synchronous, which makes some compromises in a real two-phase commit.
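To make the cost of that far-end acknowledgment concrete, here is a minimal sketch of the two-phase pattern in Python. The `Participant` class and `two_phase_commit` function are hypothetical names invented for illustration; this is not how MySQL, DRBD, or RDS Multi-AZ implement it, and a real system would also handle timeouts, logging, and recovery.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """A hypothetical storage node that can stage and then finalize a write."""
    name: str
    healthy: bool = True
    staged: dict = field(default_factory=dict)
    committed: dict = field(default_factory=dict)

    def prepare(self, key, value):
        # Phase 1: stage the write and vote. A real node would also
        # flush the staged record to durable storage before voting yes.
        if not self.healthy:
            return False
        self.staged[key] = value
        return True

    def commit(self, key):
        # Phase 2: make the staged write visible.
        self.committed[key] = self.staged.pop(key)

    def abort(self, key):
        # Discard anything staged for this key.
        self.staged.pop(key, None)


def two_phase_commit(participants, key, value):
    """Return True only if every participant acknowledged the prepare."""
    votes = [p.prepare(key, value) for p in participants]
    if all(votes):
        for p in participants:
            p.commit(key)
        return True
    for p in participants:
        p.abort(key)
    return False
```

The coordinator cannot report success until every `prepare` call has returned, so in a networked version each commit pays at least one round trip to the slowest participant. That wait is the latency described above.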

2. INSUFFICIENT CACHING

...

Click to read more ...

Monday
Aug 26, 2013

Reddit: Lessons Learned from Mistakes Made Scaling to 1 Billion Pageviews a Month

Jeremy Edberg, the first paid employee at reddit, teaches us a lot about how to create a successful social site in a really good talk he gave at the RAMP conference. Watch it here at Scaling Reddit from 1 Million to 1 Billion–Pitfalls and Lessons.

Jeremy uses a virtue and sin approach. Examples of the mistakes made in scaling reddit are shared and it turns out they did a lot of good stuff too. Somewhat of a shocker is that Jeremy is now a Reliability Architect at Netflix, so we get a little Netflix perspective thrown in for free.

Some of the lessons that stood out most for me: 

  • Think of SSDs as cheap RAM, not expensive disk. When reddit moved from spinning disks to SSDs for the database the number of servers was reduced from 12 to 1 with a ton of headroom. SSDs are 4x more expensive but you get 16x the performance. Worth the cost. 
  • Give users a little bit of power, see what they do with it, and turn the good stuff into features. One of the biggest revelations for me was how much reddit learns from its users and how much it relies on users to make the site run smoothly. Users are going to tell you a lot of things you don’t know. For example, reddit gold started as a joke in the community. They made it a product and users love it.
  • It’s not necessary to build a scalable architecture from the start. You don’t know what your feature set will be when you start out, so you won’t know what your scaling problems will be. Wait until your site grows so you can learn where your scaling problems are going to be.
  • Treat non-logged-in users as second-class citizens. By always serving cached content to logged-out users, reddit lets Akamai bear the brunt of its traffic. Huge performance improvement. 
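The last lesson can be sketched as a small cache-policy function. This is only an illustration of the idea, not reddit's or Akamai's actual configuration: the function name, the `session` cookie name, and the 60-second lifetime are all assumptions.

```python
def cache_control_for(request_cookies, session_cookie="session"):
    """Pick a Cache-Control header value based on login state.

    request_cookies is a dict of cookie names to values. The cookie
    name and the max-age below are illustrative assumptions.
    """
    if session_cookie in request_cookies:
        # Logged-in users get personalized pages that must not be shared.
        return "private, no-store"
    # Logged-out users all see the same page, so a CDN can serve it.
    return "public, max-age=60"
```

With a policy like this, anonymous traffic can be answered from the CDN edge and only logged-in requests reach the application servers.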

There's lots more. Here's my gloss of the talk where we learn many lessons from the mistakes made in the early days of scaling reddit:

Click to read more ...

Friday
Aug 23, 2013

Stuff The Internet Says On Scalability For August 23, 2013

Hey, it's HighScalability time:

  • 5x: AWS vs combined size of other cloud vendors; Every Second on The Internet: Why we need so many servers.
  • Quotable Quotes:
    • @chaliy: Today I learned that I do not understand how #azure scaling works, instance scale does not affect requests/sec I can load.
    • @Lariar: Note how crazy this is. An international launch would have been a huge deal. Now it's just another thing you do.
    • smacktoward: The problem with relying on donations is that people don't make donations.
    • @toddhoffious: Programming is a tool built by logical positivists to solve the problems of idealists and pragmatists. We have a fundamental mismatch here.
    • @etherealmind: Me: "Weird, my phone data isn't working" Them: "They turned the 3G off at the tower because it  interferes with the particle accelerator"
    • John Carmack: In computer science, just about the only thing that’s really science is when you’re talking about algorithms. And optimization is an engineering. But those don’t actually occupy that much of the total time spent programming. 
    • @gappy3000: Ideas are assets. Code is a liability. So maximize ideas/code.
  • How can spiders and flies walk up walls? See for yourself with a fun DIY on How to: test Galileo's scaling laws. An idea that is simple yet profound in its implications: "the width of an object is doubled, the surface area is squared and the volume is cubed." It means size matters. Elephants can't dance and jump and insects can walk on water. That's because the ratio of area to volume governs everything we do. You get to drop stuff from great heights and watch things explode (or not). What could be better?
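The scaling law in the last item is easy to check numerically. Reading "squared" and "cubed" as applying to the scale factor, doubling the width multiplies surface area by 4 and volume by 8. The sketch below uses a cube as a stand-in shape, which is an assumption made purely for simplicity; the function names are invented for illustration.

```python
def scale(width_factor):
    """Return (area_factor, volume_factor) for a uniformly scaled object."""
    return width_factor ** 2, width_factor ** 3


def area_to_volume_ratio(side):
    """Surface-area-to-volume ratio of a cube with the given side length."""
    return (6 * side ** 2) / (side ** 3)
```

Because the ratio works out to 6 / side, it shrinks as the object grows. Small creatures have a lot of surface relative to their mass, and large ones have very little.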

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Thursday
Aug 22, 2013

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second edition

Google has released an epic second edition of their groundbreaking The Datacenter as a Computer book. It's called an introduction, but at 156 pages I would love to see what the Advanced version would look like!

John Fries in a G+ comment has what I think is a perfect summary of the ultimate sense of the book:

It's funny, when I was at Google I was initially quite intimidated by interacting with an enormous datacenter, and then I started imagining the entire datacenter was shrunk down into a small box sitting on my desk, and realized it was just another machine and the physical size didn't matter anymore

It's such a far ranging book that it's impossible to characterize simply. It covers an amazing diversity of topics, from an introduction to warehouse-scale computing; workloads and software infrastructure; hardware; datacenter architecture; energy and power efficiency; cost structures; how to deal with failures and repairs; and it closes with a discussion of key challenges, which include rapidly changing workloads, building responsive large scale systems, energy proportionality of non-CPU components, overcoming the end of Dennard scaling, and Amdahl's cruel law.

In reading it I get the sense the Faerie Queen has transported us to the land of Faerie, a special other place of timeless truths, where dragons roam, and mortal danger lurks. And if you do escape, nothing is quite the same ever again. 

Abstract:

Click to read more ...

Tuesday
Aug 20, 2013

Sponsored Post: Couchbase, Evernote, 10gen, Stackdriver, BlueStripe, Apple, Surge, Booking, Rackspace, aiCache, Aerospike, ScaleOut, New Relic, LogicMonitor, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Evernote is hiring a Senior DevOps Engineer in our mission to help the world remember everything. Our work environment is collaborative and relaxed, our benefits and perks are fantastic, and we enrich the lives of more than 65 million users worldwide every day! Please apply here.

  • Stackdriver is looking for a systems + cloud + dev + ops guru to serve as our liaison within the DevOps community. If you are passionate about monitoring and automation, enjoy working on open source, and are excited by the prospect of sharing your expertise with your peers, get in touch with us today! http://bit.ly/143ARmy

  • We need awesome people @ Booking.com - We want YOU! Come design next generation interfaces, solve critical scalability problems, and hack on one of the largest Perl codebases. Please apply online.

  • Apple has multiple openings. Changing the world is all in a day's work at Apple. Imagine what you could do here.
    • Siri Software Engineer. Play a part in the next revolution in human-computer interaction. Contribute to a product that is redefining mobile computing. Create groundbreaking technology for large scale systems, spoken language, big data, and artificial intelligence. And work with the people who created the intelligent assistant that helps millions of people get things done — just by asking. To apply please visit This URL.
    • Software Engineer - Messaging Services. An exciting opportunity for a Software Engineer to join Apple's Messaging Services team. We build the cloud systems that power some of the busiest applications in the world. You'll have the opportunity to explore a wide range of technologies, developing the server software that is driving the future of messaging and mobile services. To apply please visit this URL
    • iCloud Documents Server Engineer - C++. The iCloud team is looking for a C++ engineer with a strong background in web services development. The successful candidate has demonstrated deep experience in building high-performing systems that are scalable and extensible. This is a terrific position for an engineer interested in developing the next generation of cloud support for iOS and OS X. To apply please visit this URL.
    • Sr Software Engineer-iCloud. An exciting opportunity for a Software Engineer to join Apple's Messaging Services team. We build the cloud systems that power some of the busiest applications in the world, including iMessage, FaceTime and Apple Push Notifications. To apply please visit this URL.
    • SW Engineering Apps Manager. Be a founding member of Apple's newly-minted iCloud Infrastructure Engineering Team! We are designing, building, and supporting new, critical infrastructural systems and frameworks which provide services like structured and unstructured storage, request routing, search queueing, security, and much more. These form the platform upon which many iCloud backend systems will be built. To apply please visit this URL.
    • iCloud Software Engineer. Explore the far reaches of the possible by joining the team building the future of cloud services at Apple!  Consider joining a small team writing the software which forms the foundation for some of our most exciting iCloud products and services. To apply please visit this URL.
    • Sr. Software Engineer - iCloud. iCloud is looking for a talented software engineer who can help us make iCloud even better. Do you love designing & architecting highly scalable, distributed web services? Does the idea of performance tuning Java applications make your heart leap? To apply please visit this URL.
    • Software Engineer - iCloud. iCloud is looking for a talented software engineer who can help us make iCloud even better. Do you love designing & architecting highly scalable, distributed web services? Does the idea of performance tuning Java applications make your heart leap? To apply please visit this URL.
    • Sr. Software Engineer. The Messaging Services team is looking for a talented software engineer to help us develop Apple's messaging platform. In this role, you'll build the server stacks for iMessage, FaceTime, Apple Push Notifications, and other systems. To apply please visit this URL.
    • Sr Software Engineer-Messaging Services. The Messaging Services team is looking for a talented software engineer to help us develop Apple's messaging platform. In this role, you'll build the server stacks for iMessage, FaceTime, Apple Push Notifications, and other systems. To apply please visit this URL.

  • LogicMonitor is looking for a Front End developer to have a huge impact, be valued, realize their dreams, and help us realize ours. We are looking for someone to own the code that delivers the design and usability of LogicMonitor's enterprise SaaS application(s). Please apply online

  • New Relic is looking for a Java Scalability Engineer in Portland, OR. Ready to scale a web service with more incoming bits/second than Twitter?  http://newrelic.com/about/jobs

Fun and Informative Events

  • Surge - The Scalability & Performance Conference, presented by OmniTI, Sept. 12th-13th, features speakers from Joyent, Fastly, Dyn, Netflix, Linkedin and Amazon. Special, High Scalability Reader Rate: $50 off registration--through Sept. 10! Book hotel and get $50 off, from OmniTI. 

Cool Products and Services

  • The leading technology companies use Couchbase as their NoSQL database. Download the free open-source version of Couchbase Server and make something awesome today.

  • MongoDB Management Service (MMS) is a cloud-based suite of services for managing MongoDB deployments. In addition to monitoring and alerting, now you can seamlessly back up your MongoDB deployment to the cloud using MMS. To get started with monitoring and backup, visit mms.10gen.com.

  • BlueStripe FactFinder Express is the ultimate tool for server monitoring and solving performance problems. Monitor URL response times and see if the problem is the application, a back-end call, a disk, or OS resources.

  • AppDynamics is an easy-to-use application performance management solution that offers code-level insight into Java, .NET and PHP applications. Get the free trial.

  • NEW! Aerospike 3 - Download FREE. Introducing the new Aerospike 3 database that builds off of Aerospike's legacy of speed, scale, and reliability, adding an extensible data model that supports complex data types, large data types, queries using secondary indexes, user defined functions (UDFs) and distributed aggregations using Stream UDFs for real-time data.

  • The Rackspace Cloud Application Programming Interface (API)  has changed the game allowing customers to easily modify their cloud configuration with just a few lines of code. The API is a powerful tool and something everyone should know about, regardless of your level of technical ability.

  • aiScaler, aiProtect, aiMobile integrated solutions for Dynamic Site Acceleration, Denial of Service Protection and Simplifying Mobile Content. Free instant trial, no sign-up required. http://aicache.com/

  • ScaleOut Software. In-Memory Data Grids for the Enterprise. Download a Free Trial.

  • LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.

  • AppDynamics is the very first free product designed for troubleshooting Java performance while getting full visibility in production environments. Visit http://www.appdynamics.com/free.

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Click to read more ...

Monday
Aug 19, 2013

What can the Amazing Race to the South Pole Teach us About Startups?

At the heart of every software adventure exists a journey in service of a quest. Melodramatic much? Sorry, but while wandering dazzled through Race to the End of the Earth, a fantastic exhibit at the Royal BC Museum on the 1911-1912 race to the South Pole between Norwegian explorer Roald Amundsen and British naval officer Robert Scott, I couldn’t help but think about the radically different approaches the two teams took to the race. It shocked me to see that some of the same principles that lead to success or failure in software development also seem to lead to success or failure in exploration.

I wish I could reproduce the experience of walking through the exhibit. Plaque after plaque I remember wondering out loud at Scott’s choices and then nodding in agreement with Amundsen’s approach. The core conflict was straight out of any ancient Agile (Amundsen) vs Waterfall (Scott) thread you can find on Usenet. And Waterfall lost.

As background here are some sources you may want to read to understand more about the race. Race to The End: Amundsen, Scott, and the Attainment of the South Pole is one of the books they sold at the museum. There are plenty of other books to choose from as well. The race to the South Pole seems like a good online source for the story as does The Tragic Race to Be First to the South Pole in Wired. 

In short: the goal of each expedition was to be first to the South Pole. Each leader approached their task in radically different ways, stemming from their different goals, experiences, and temperaments.  Amundsen’s team arrived in good health 33 days before Scott’s malnourished and exhausted team learned the devastating news that they were too late. Amundsen’s team returned home without losing a single life. Tragically, all five men in Scott's polar party died on the ice returning from the pole. 

For a detailed list of the differences between the two teams, take a look at Comparison of the Amundsen and Scott Expeditions, 10 Mistakes That Caused the Most Punishing Nature Expedition in History, and The South Pole Fifty Years After. Of course I didn’t have access to this information at the time, so I’ll weave this data into some software development lessons:

Click to read more ...

Friday
Aug 16, 2013

Stuff The Internet Says On Scalability For August 16, 2013

Hey, it's HighScalability time:

  • 1 trillion: edges in Facebook's search graph
  • Quotable Quotes:
    • Miguel de Icaza: Callbacks as our Generations' Go To Statement
    • @kaleidic: "Haskell is a great language as a way of thinking, but I prefer programming in a language where I can cheat."--Meijer
    • T.S. Eliot, who totally got the Internet: Distracted from distraction by distraction
  • The argument eternal: Why Some Startups Say the Cloud Is a Waste of Money. The argument follows a getting back-to-nature pattern. Amazon is the glittering city of compromised values and colos are the familiar places of refuge and virtue. But be honest, don't we all really know the score by now?
  • Fred Wilson: The Similarities Between Building and Scaling a Product and a Company: The system you and your team built will break if you don't keep tweaking it as demand grows. Greg Pass, who was VP Engineering at Twitter during the period where Twitter really scaled, talks about instrumenting your service so you can see when it's reaching a breaking point, and then fixing the bottleneck before the system breaks. He taught me that you can't build something that will never break. You have to constantly be rebuilding parts of the system and you need to have the data and processes to know which parts to focus on at what time. The team is the same way.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Tuesday
Aug 13, 2013

In Memoriam: Lavabit Architecture - Creating a Scalable Email Service

With Lavabit shutting down under murky circumstances, it seems fitting to repost an old (2009), yet still very good post by Ladar Levison on Lavabit's architecture. I don't know how much of this information is still current, but it should give you a general idea what Lavabit was all about.

Getting to Know You

Click to read more ...

Monday
Aug 12, 2013

100 Curse Free Lessons from Gordon Ramsay on Building Great Software

Gordon Ramsay is a world renowned chef with a surprising amount to say on software development. Well, he says it about cooking and running a restaurant, but it applies to software development too.

You may have seen Gordon Ramsay on one of his many TV shows. Hell's Kitchen is a competition between chefs trying to win a dream job: head chef of their own high-end restaurant. On this show Ramsay is judge, jury, and executioner. And he chops off more than a few heads. Kitchen Nightmares is a show where Ramsay is called in by restaurant owners to help turn around their failing restaurants. On this show Ramsay is there to help.

If you just watch Hell's Kitchen you will likely conclude Ramsay is one of the devil's own helpers ("ram" is the symbol of the devil and "say" means he speaks for the devil: Ramsay). Ramsay screams, yells, cusses, belittles, and throws tantrums even a 7-year-old could learn from. Then he does it all over again just for spite. In Hell's Kitchen there's no evidence at all of why Ramsay is such a respected chef. He is just a nasty man.

Now if you watch the British version of Kitchen Nightmares you will see a slightly different side of Ramsay. He still yells and cusses a lot, but you will also see something else: this guy seriously knows what he is doing. The depth of his knowledge in all phases of the restaurant business is immediately apparent as he methodically works to fix what's broken.

Ramsay knows how to run a profitable restaurant. That's one of his key skills. Anyone can lose money running a restaurant; the secret is knowing how to make money running one. Apparently, if you run it right, a restaurant can make a lot of money. It can also lose a lot of money.

How does Ramsay teach people how to run a profitable restaurant? It's not what you may be thinking. He isn't about cutting costs, shoddy work, and cheap labor. Ramsay is all about profit through excellence and skill. That's what attracts people to a restaurant and keeps them coming back...and back...and back.

It's great fun to watch Ramsay cuss and cajole his way through the entire restaurant staff putting his finger directly on problems, creating inspired solutions, and then mentoring the staff and owners through the transformations needed to become a good restaurant.

While watching you have to wonder how people who invested their life savings in a restaurant could be so screwed up. But then you realize we are all screwed up at one time or another. From the distance TV provides everyone can look bad. Running a restaurant is hard and it's oh so easy to get in a rut.

And if you were stuck in a rut who would you want to give you a lift out? A super hero? How about Super Chef instead?

Sadly, in the real world when you run into trouble a super hero with curiously sharp knives doesn't come to your rescue. In the real world the same uninspired people apply the same uninspired strategies until losses cause the heart stoppage that blissfully puts an end to the torture. But this isn't the real world, this is TV. And on TV you see Ramsay slap the electroshock paddles on the restaurant and bring it back to life. It's wondrous to see both the restaurant and the people come back to life.

What's fun is when Ramsay revisits the restaurant after six or so weeks to see how the restaurant is doing. Most of the time the restaurant is doing better. There are a lot of happy customers and money is being made. And they don't slavishly follow what Ramsay said to do either. Instead, Ramsay taught them the principles of running a restaurant and then they learned how to apply the principles to their own situation.

Sometimes on Ramsay's reinspection tour he finds the restaurant has closed down or isn't doing as well as you might expect. The reasons for failure vary. Sometimes the initial problems were too great. One guy made a bad deal on a lease so it didn't matter in the end if the restaurant improved or not. Sometimes people are simply stubborn and won't change their ways. An example of this scenario was a fancy French restaurant where the owner gave the chef total control to cook whatever he wanted. It turns out the chef was addicted to complex foods that prevented the restaurant from getting its Michelin star.

While watching Ramsay work his magic in Kitchen Nightmares I began to see how similar a kitchen team was to a software team. Running a successful kitchen is a high stress, high work load, high quality, high variability environment where teamwork and communication are key. Sounds like software development to me.

Here are some of the notes I took on Ramsay's restaurant turnaround strategies. I'll leave extending the metaphor into the software world in your capable hands:

Click to read more ...

Friday
Aug 9, 2013

Stuff The Internet Says On Scalability For August 9, 2013

Hey, it's HighScalability time:

  • 25%: Percentage of North American Internet Traffic served by Google
  • Quotable Quotes:
    • Aristotle: We must not expect more precision than the subject-matter admits.
    • Bret Victor: Technologies change quickly and minds change slowly. Ideas require people to unlearn what they’ve learned and adopt new ideas. They think what they’ve learned is programming and this new stuff isn’t programming.
    • Steven Roberts: Art without engineering is dreaming; engineering without art is calculating. 
    • @jamesurquhart: “@b6n: @jamesurquhart @krishnan @rUv There is no such thing as traditional infrastructure at web scale.” < My point exactly.
    • HackerNewsOnion: Batch is the new realtime. 
    • @rbranson: "How do we store this at scale?" "Redis on a cr1.8xlarge" "lmao"
    • John Carmack: We are fundamentally creativity bound at this point. We need faster iterations.
    • ReadWrite: The tech sector as a whole has created more than a trillion dollars in value over the past decade. Yet that value creation is incredibly concentrated. Nearly two-thirds of the increase reaped by investors and employees comes from Apple and Google alone, with the likes of Amazon, Facebook, LinkedIn, eBay, Yandex and Baidu rounding out the list.
  • Robert Scoble on how fast companies are growing these days compared to the past. Talking with the Glide CEO, makers of a mobile video chat app, he said in two months they grew to 3.5 million users. Robert compared this growth rate to Twitter's, which when it came out had 13,000 users in 6 months. In six years our expectations and our world have sped up. Reduction of friction has driven faster adoption. OAuth is one friction reduction mechanism because it makes onboarding so much easier and faster.
  • I was totally suckered in by Bret Victor's The Future of Programming. An artful performance two steps above the usual. When he starts listing the future of programming as the direct manipulation of data; goals and constraints; spatial representations; concurrency, I'm thinking wow, this must be the early 1970s and he's talking about these things and we still don't have them today. We are basically still using Fortran. And that was the point. Mu. 
  • The ultimate data storage system: DNA. A New Approach to Information Storage: In Church's case, a team of researchers used sequencing technology to format his 54,000-word book (with words, images, and a JavaScript program, it came down to 5.27 megabits, or 658.75 kilobytes) at a density of 5.5 petabytes per cubic millimeter. While the physical volume of 70 billion physical copies of his book would fill nearly 3,500 New York City Public Libraries (including all branches), and a digital version would require somewhere in the neighborhood of 46 storage devices with 1TB drives, all those copies of Church's book fit on a piece of DNA no larger than a speck of dust. What's more, the copies will last hundreds of thousands of years—perhaps even a million years—and do not require any special handling or temperature conditions.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...