
Tuesday
Apr 23, 2013

Facebook Secrets of Web Performance

This is a repost of part 1 of an interview I did for the Boundary blog.

Boundary: What is Facebook’s secret sauce for managing what’s got to be the biggest Big Data project, if you will, on the Web?

Hoff: From several presentations we’ve learned what Facebook insiders like Aditya Agarwal and Robert Johnson, both former Directors of Engineering, consider their secret sauce:


Monday
Apr 15, 2013

Scaling Pinterest - From 0 to 10s of Billions of Page Views a Month in Two Years

Pinterest has been riding an exponential growth curve, doubling every month and a half. They’ve gone from 0 to 10s of billions of page views a month in two years, from 2 founders and one engineer to over 40 engineers, and from one little MySQL server to 180 Web Engines, 240 API Engines, 88 MySQL DBs (cc2.8xlarge) + 1 slave each, 110 Redis Instances, and 200 Memcache Instances.

Stunning growth. So what’s Pinterest's story? To tell their story we have our bards, Pinterest’s Yashwanth Nelapati and Marty Weiner, who tell the dramatic story of Pinterest’s architecture evolution in a talk titled Scaling Pinterest. This is the talk they would have liked to hear a year and a half ago, when they were scaling fast and there were a lot of options to choose from. And they made a lot of incorrect choices.

This is a great talk. It’s full of amazing details. It’s also very practical, down to earth, and it contains strategies adoptable by nearly anyone. Highly recommended.

Two of my favorite lessons from the talk:

  1. Architecture is doing the right thing when growth can be handled by adding more of the same stuff. You want to be able to scale by throwing money at the problem, which means throwing more boxes at it as you need them. If your architecture can do that, then you’re golden.
  2. When you push something to the limit, all technologies fail in their own special way. This led them to evaluate tool choices with a preference for tools that are: mature; really good and simple; well known and liked; well supported; consistently good performers; as failure free as possible; and free. Using these criteria they selected MySQL, Solr, Memcache, and Redis. Cassandra and Mongo were dropped.

These two lessons are interrelated. Tools following the principles in (2) can scale by adding more boxes. And as load increases, mature products should have fewer problems. When you do hit problems, you’ll at least have a community to help fix them. It’s when your tools are too tricky and too finicky that you hit walls too high to climb over.

The best part of the entire talk, I think, is the discussion of why sharding is better than clustering; it’s where the themes of growing by adding resources, few failure modes, maturity, simplicity, and good support come to full fruition. Notice that all the tools they chose grow by adding shards, not through clustering. The discussion of why they prefer sharding and how they shard is truly interesting and will probably cover ground you’ve never considered before.
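
To make the sharding discussion concrete, here is a minimal sketch of the lookup-free, ID-based scheme described in the talk: a single 64-bit object ID packs together a shard ID, a type, and a local ID. The 16/10/36 bit split follows the talk; the helper names and routing map are illustrative:

    SHARD_BITS, TYPE_BITS, LOCAL_BITS = 16, 10, 36

    def make_id(shard_id, type_id, local_id):
        """Pack shard, type, and local id into a single 64-bit identifier."""
        assert shard_id < (1 << SHARD_BITS)
        assert type_id < (1 << TYPE_BITS)
        assert local_id < (1 << LOCAL_BITS)
        return (shard_id << (TYPE_BITS + LOCAL_BITS)) | (type_id << LOCAL_BITS) | local_id

    def parse_id(obj_id):
        """Recover (shard, type, local) from the id itself: no lookup service."""
        local_id = obj_id & ((1 << LOCAL_BITS) - 1)
        type_id = (obj_id >> LOCAL_BITS) & ((1 << TYPE_BITS) - 1)
        shard_id = obj_id >> (TYPE_BITS + LOCAL_BITS)
        return shard_id, type_id, local_id

    # Routing is a static map from shard id to the database that owns it.
    # Growth means adding databases and giving new shard ranges to them.
    SHARD_MAP = {0: "mysql-db-0001", 1: "mysql-db-0001", 2: "mysql-db-0002"}

    def db_for(obj_id):
        shard_id, _, _ = parse_id(obj_id)
        return SHARD_MAP[shard_id]

The appeal is that the ID itself says where the data lives: no lookup service, no cluster coordination, and growth means assigning new shard ranges to new databases, which is exactly the "add more of the same stuff" property from lesson (1).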

Now, let’s see how Pinterest scales:


Wednesday
Apr 10, 2013

Check Yourself Before You Wreck Yourself - Avocado's 5 Early Stages of Architecture Evolution

In Don’t panic! Here’s how to quickly scale your mobile apps, Mike Maelzer paints a wonderful picture of how Avocado, a mobile app for connecting couples, evolved to handle 30x traffic within a few weeks. If you are just getting started, this is a great example to learn from.

What I liked: it's well written, packing a lot of useful information into a little space; it's failure driven, showing incremental change propelled by purposeful testing and production experience; it shows awareness of what's important (in their case, user signup); and a replica setup was used for testing, a nice cloud benefit.

Their biggest lesson learned is a good one:

It would have been great to start the scaling process much earlier. Due to time pressure we had to make compromises – like dropping four of our media resizer boxes. While throwing more hardware at some scaling problems does work, it’s less than ideal.

Here's my gloss on the article:

Evolution One - Make it Work


Monday
Apr 1, 2013

Khan Academy Checkbook Scaling to 6 Million Users a Month on GAE

Khan Academy is a non-profit started by Salman Khan with the Big Hairy Audacious Goal of providing a free, world-class education to anyone, anywhere, anytime. That’s a lot of knowledge. Having long been inspired and captivated by the Khan Academy, I was really curious to know how they plan to do it. Ben Kamens, lead developer at Khan Academy, gives the somewhat surprising answer in an interview: How to Scale your Startup to Millions of Users.

The short answer: develop a strong team, focus on features, let Google App Engine do the heavy lifting.

Some people seem to be turned off by all the GAE love in the interview. Part of it is that the interviewer is Fred Sauer, Developer Advocate for Google App Engine, so there’s a level of familiarity between the two. But the biggest part is simply that they really like GAE, for all the reasons you are supposed to like GAE. And that’s OK. In this day and age you are free to love whichever platform you choose.

Biggest surprise:

  • A profile on 60 Minutes drove more traffic than TechCrunch, Hacker News, and everything else combined. Old media is not dead.

Part I liked the best:

  • GAE is an abstraction over all the typical scalability issues, which lets you focus on business problems. All abstractions leak, and you are going to have to deal with problems no matter what you choose, but the platform you select determines the type of problems you deal with. It's all about understanding the tradeoffs you're making, as the sketch below illustrates.
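
To make that tradeoff concrete, here is what a minimal App Engine handler of that era looks like in Python. The model and route are hypothetical, not Khan Academy's code; the point is how little of it concerns scaling:

    import webapp2
    from google.appengine.ext import ndb

    class Exercise(ndb.Model):
        # A hypothetical model. The datastore shards and replicates it for you.
        title = ndb.StringProperty()
        views = ndb.IntegerProperty(default=0)

    class ExerciseHandler(webapp2.RequestHandler):
        def get(self, exercise_id):
            # No connection pools, no replicas, no cache tier to manage here.
            ex = Exercise.get_by_id(int(exercise_id))
            self.response.write(ex.title if ex else "not found")

    app = webapp2.WSGIApplication([(r"/exercise/(\d+)", ExerciseHandler)])

The leak shows up elsewhere: in exchange for not worrying about replication and load balancing, you take on datastore query restrictions, instance warmup, and platform lock-in.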

Here’s my gloss on the major takeaways from the interview:


Wednesday
Mar 13, 2013

Iron.io Moved From Ruby to Go: 28 Servers Cut and Colossal Clusterf**ks Prevented

For the last few months I've been programming a system in Go, so I'm always on the lookout for information to feed my confirmation bias. An opportunity popped up when Iron.io wrote about their experience using Go to rewrite IronWorker, their ever-busy job execution system, originally coded in Ruby.

The result:


Monday
Feb 25, 2013

SongPop Scales to 1 Million Active Users on GAE, Showing PaaS is not Passé

Should you use PaaS for your next project? Often the answer is no because you want control, but here's an example from SongPop showing why the promise of PaaS is not passé. SongPop was able to autoscale to 60 million users, 1 million daily active users, deliver 17 terabytes/day of songs and images worldwide, handle 10k+ queries/second, all with a 6 person engineering team, and only one engineer working full-time on the backend.
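
A quick back-of-envelope check on those delivery numbers (my arithmetic, not from the post) shows what that means in sustained bandwidth:

    # 17 terabytes/day of songs and images, averaged over 86,400 seconds:
    bytes_per_day = 17 * 10**12
    avg_gbps = bytes_per_day * 8 / 86400 / 10**9
    print(round(avg_gbps, 2))  # ~1.57 Gbps sustained, before any peak-hour factor

Roughly 1.6 Gbps of media traffic, around the clock, supported by a single full-time backend engineer. That is the argument for letting the platform do the heavy lifting.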

Unfortunately there aren't a lot of details, but what there is can be found in Scaling SongPop to 60 million users with App Engine and Google Cloud Storage. The outline follows the script. You start small. Let PaaS do the heavy lifting. And when you need to scale you just buy more resources and tune a little (maybe a lot). The payoff is you get to focus on feature development and can get by with a small team.

Here's a diagram of their architecture:


Monday
Jan 28, 2013

DuckDuckGo Architecture - 1 Million Deep Searches a Day and Growing

This is an interview with Gabriel Weinberg, founder of DuckDuckGo and all-around startup guru, on what DDG’s architecture looks like in 2012.

Innovative search engine upstart DuckDuckGo had 30 million searches in February 2012 and averages over 1 million searches a day. It’s being positioned by super investor Fred Wilson as a clean, private, impartial and fast search engine. After talking with Gabriel, I think something Fred Wilson said earlier gets closer to the heart of the matter: We invested in DuckDuckGo for the Reddit, Hacker News anarchists.

Choosing DuckDuckGo can be thought of as not just a technical choice, but a vote for revolution. In an age when knowing your essence is not about love or friendship, but about more effectively selling you to advertisers, DDG is positioning itself as the do-not-track alternative, keeper of the privacy flame. You will still be monetized of course, but in a more civilized and anonymous way.

Pushing privacy is a good way to carve out a competitive niche against Google et al, as by definition they can never compete on privacy. I get that. But what I found most compelling is DDG’s strong vision of a crowdsourced network of plugins giving broader search coverage by tying an army of vertical data suppliers into their search framework. There's a specialized Lego plugin for searching against a complete Lego database, for example. Use the name of a spice in your search query and DDG will recognize it and may trigger a deeper search against a highly tuned recipe database. Many different plugins can be triggered on each search and it’s all handled in real-time.
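
The excerpt doesn't show DDG's implementation, but the trigger mechanism described above is easy to sketch. In this entirely illustrative Python version (plugin names, trigger words, and data sources are made up), plugins register trigger words and every matching plugin runs against its vertical data source:

    from concurrent.futures import ThreadPoolExecutor

    PLUGINS = []

    def plugin(triggers):
        """Register a vertical search plugin keyed on its trigger words."""
        def register(fn):
            PLUGINS.append((set(triggers), fn))
            return fn
        return register

    @plugin({"cumin", "paprika", "saffron"})
    def recipe_search(query):
        return "recipes for: " + query         # stand-in for a tuned recipe database

    @plugin({"lego"})
    def lego_search(query):
        return "lego sets matching: " + query  # stand-in for the Lego database

    def deep_search(query):
        """Run every plugin whose triggers appear in the query, in parallel."""
        words = set(query.lower().split())
        matched = [fn for triggers, fn in PLUGINS if triggers & words]
        with ThreadPoolExecutor() as pool:
            return [f.result() for f in [pool.submit(fn, query) for fn in matched]]

    print(deep_search("saffron chicken"))      # ['recipes for: saffron chicken']

Real plugins return structured, typed results rather than strings; the essential idea is that many plugins can fire on a single search and all of them must come back in real time.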

Can’t searching the Open Web provide all this data? Not really. This is structured data with semantics, not an HTML page. You need a search engine that’s capable of categorizing, mapping, merging, filtering, prioritizing, searching, formatting, and disambiguating richer data sets, and you can’t do that with a keyword search. You need the kind of smarts DDG has built into their search engine. One problem, of course, is that now that data has become valuable, many grown-ups don’t want to share anymore.

Being ad supported puts DDG in a tricky position. Targeted ads are more lucrative, but ironically DDG’s do-not-track policy means they can’t gather targeting data. Yet that’s also a selling point for those interested in privacy. But since search is famously intent-driven, DDG’s technology of categorizing queries and matching them against data sources is already a form of high-value targeting.

It will be fascinating to see how these forces play out. But for now let’s see how DuckDuckGo implements their search engine magic...

Information Sources


Monday
Jan 21, 2013

Processing 100 Million Pixels a Day - Small Amounts of Contention Cause Big Problems at Scale

This is a guest post by Gordon Worley, a Software Engineer at Korrelate, where they correlate (see what they did there) online purchases to offline purchases.

Several weeks ago, we came into the office one morning to find every server alarm going off. Pixel log processing was behind by 8 hours and not making headway. Checking the logs, we discovered that a big client had come online during the night and was giving us 10 times more traffic than we were originally told to expect. I wouldn’t say we panicked, but the office was certainly more jittery than usual. Over the next several hours, though, thanks both to foresight and quick thinking, we were able to scale up to handle the added load and clear the backlog to return log processing to a steady state...
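
The excerpt stops before the fix, but the claim in the title, that small amounts of contention cause big problems at scale, deserves a worked example. Amdahl's law makes the point (my illustration, not from the post): if a fraction s of the work is serialized behind a lock or a single writer, throughput stops scaling long before the hardware does:

    # speedup(n) = 1 / (s + (1 - s) / n), for serial (contended) fraction s
    def speedup(s, n):
        return 1.0 / (s + (1.0 - s) / n)

    for s in (0.01, 0.05):
        print(s, round(speedup(s, 10), 1), round(speedup(s, 100), 1))
    # 1% contention: 9.2x with 10 workers, but only 50.3x with 100 (half wasted)
    # 5% contention: 6.9x with 10 workers, capped near 16.8x with 100

A lock that costs 1% at normal load is invisible; at 10 times the traffic it can be the reason the backlog isn't clearing.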


Monday
Jan 14, 2013

MongoDB and GridFS for Inter- and Intra-Datacenter Data Replication

This is a guest post by Jeff Behl, VP Ops @ LogicMonitor. Jeff has been a bit herder for the last 20 years, architecting and overseeing the infrastructure for a number of SaaS-based companies.

Data Replication for Disaster Recovery

An inevitable part of disaster recovery planning is making sure customer data exists in multiple locations.  In the case of LogicMonitor, a SaaS-based monitoring solution for physical, virtual, and cloud environments, we wanted copies of customer data files both within a data center and outside of it.  The former was to protect against the loss of individual servers within a facility, and the latter for recovery in the event of the complete loss of a data center.
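
The pattern the post builds toward can be sketched with PyMongo: write the customer data files into GridFS and let a replica set whose members span racks and data centers do the copying. Hostnames, database names, and paths below are illustrative:

    from pymongo import MongoClient
    import gridfs

    # Replica set members can live in different facilities; once a file is
    # written to the primary, its chunks replicate to the secondaries.
    client = MongoClient(
        "mongodb://dc1-a.example.com,dc1-b.example.com,dc2-a.example.com",
        replicaSet="customerdata",
    )
    fs = gridfs.GridFS(client.backups)

    with open("/data/customer42/config.tar.gz", "rb") as f:
        file_id = fs.put(f, filename="customer42/config.tar.gz")

A write concern such as w=2 would additionally hold the write until at least one secondary, possibly in the other facility, has its copy.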

Where we were: Rsync


Monday
Jan 7, 2013

Analyzing Billions of Credit Card Transactions and Serving Low-Latency Insights in the Cloud

This is a guest post by Ivan de Prado and Pere Ferrera, founders of Datasalt, the company behind Pangool and Splout SQL Big Data open-source projects.

The amount of payments performed using credit cards is huge. Clearly there is inherent value in the data that can be derived from analyzing all the transactions. Customer loyalty, demographics, heat maps of activity, shop recommendations, and many other statistics are useful to both clients and shops for improving their relationship with the market. At Datasalt we have developed a system in collaboration with the BBVA bank that is able to analyze years of data and serve insights and statistics to different low-latency web and mobile applications.

The main challenge we faced, besides processing Big Data input, is that the output was also Big Data, and even bigger than the input. And this output needed to be served quickly, under high load.

The solution we developed has an infrastructure cost of just a few thousand dollars per month thanks to the use of the cloud (AWS), Hadoop, and Voldemort. In the following lines we will explain the main characteristics of the proposed architecture.
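
Their stack (Hadoop for computation, Voldemort read-only stores for serving) is an instance of the batch-then-serve pattern, which in miniature looks like this. This is a toy Python sketch standing in for Hadoop jobs and Voldemort stores, not their code:

    # Batch side: periodically aggregate raw transactions into per-shop stats.
    def batch_build(transactions):
        stats = {}
        for shop_id, amount in transactions:
            s = stats.setdefault(shop_id, {"count": 0, "total": 0.0})
            s["count"] += 1
            s["total"] += amount
        return stats  # in production, bulk-loaded into the key/value store

    # Serving side: always a key lookup, never a computation.
    STORE = batch_build([("shop1", 9.99), ("shop1", 5.00), ("shop2", 20.0)])

    def get_shop_stats(shop_id):
        return STORE.get(shop_id)

    print(get_shop_stats("shop1"))  # {'count': 2, 'total': 14.99}

Because the serving layer only ever does key lookups on precomputed values, latency stays flat under load no matter how big the batch output grows, which is exactly the property they needed.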

Data, goals and first decisions
