Entries by HighScalability Team (1576)

Friday
Oct 12, 2012

Stuff The Internet Says On Scalability For October 12, 2012

It's HighScalability Time:

  • Quotable Quotes:
    • @justinesherry: First lesson watching @eric_brewer's keynote at #ricon2012 -- distributed systems make you go bald. (Both audience and speaker!)
    • @rodos: With so much manure in the room of #BigData the must be a pony in here somewhere!
    • @adron: #ricon2012 OH "it isn't cloud, nosql or whatever, it's distributed systems... That is the change!" Smartest thing stated at a conf in ages.
    • @adron: OH "What's the hello world of distributed systems?" "Twitter."
    • Mikael Ronstrom: A long career on distributed systems has learnt me that it is extremely important to do proper partitioning of data sets to achieve scalability on network level. But it is still extremely important to make each node in the distributed system as large as possible. 
    • Dan Rayburn: CDNs Account For 40% Of The Overall Traffic Volume Flowing Into ISP Networks
  • One minute and 19 seconds into launch, the Falcon 9 lost one of its nine engines. Software adapted by detecting the engine failure, cutting the fuel supply, and then distributing the unused propellant to the remaining engines, allowing them to burn longer. How cool is that! That's High Availability in practice. That smooth SpaceX launch? Turns out one of the engines came apart.

  • When is kilo 1000 and when is it 1024? Thank the bit lords the values are so close because there's a kiloton of confusion on this point, as is shown by the conversation around Whats the Difference Between Kbps and kBps ?
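
To make the two sources of confusion concrete, here is a minimal sketch. It assumes the usual conventions: Kbps means kilobits per second, kBps means kilobytes per second, and "kilo" is either 1000 (SI) or 1024 (binary). The function names are illustrative only.

```python
SI_KILO = 1000       # decimal kilo, typical for network rates
BINARY_KILO = 1024   # binary kilo ("kibi"), typical for memory sizes
BITS_PER_BYTE = 8


def kbps_to_kBps(kilobits_per_second):
    """Convert kilobits per second to kilobytes per second."""
    return kilobits_per_second / BITS_PER_BYTE


def relative_gap(power):
    """Fractional gap between binary and decimal prefixes.

    power=1 compares kilo, power=2 mega, power=3 giga, and so on.
    """
    return (BINARY_KILO ** power) / (SI_KILO ** power) - 1


print(kbps_to_kBps(8000))   # an 8000 Kbps link moves 1000 kBps
print(relative_gap(1))      # about 0.024, a 2.4% gap at kilo
print(relative_gap(3))      # about 0.074, a 7.4% gap at giga
```

The factor of 8 between bits and bytes is the larger trap. The 1000 versus 1024 gap is small at kilo but grows with each prefix step.
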
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Thursday
Oct 11, 2012

RAMCube: Exploiting Network Proximity for RAM-Based Key-Value Store

RAMCube is a datacenter-oriented design for a RAM-based key-value store that scales to thousands or tens of thousands of servers, offering up to hundreds of terabytes of RAM storage. Here's the PDF Paper describing the system and here's a video of the presentation given at HotCloud.

The big idea is: RAMCube exploits the proximity of a BCube network to construct a symmetric MultiRing structure, restricting all failure detection and recovery traffic within a one-hop neighborhood, which addresses problems including false failure detection and recovery traffic congestion. In addition, RAMCube leverages BCube’s multiple paths between any pairs of servers to handle switch failures.

A few notes:

  • 75% of Facebook data is stored in memcache.
  • RAM is 1000 times faster than disk.
  • RAM is used in caches, but this increases application complexity as applications are responsible for cache consistency.
  • Under a high workload a 1% cache miss rate can lead to a 10x performance penalty.
  • ...
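
The 10x figure in the notes above follows from simple arithmetic. This is a minimal sketch that assumes a hit costs 1 unit (RAM) and a miss costs 1000 units (disk), taken from the "1000 times faster" note. It ignores queueing and other real-world effects, and the function name is illustrative.

```python
def effective_access_cost(miss_rate, ram_cost=1.0, disk_cost=1000.0):
    """Average cost per access when misses fall through to disk."""
    return (1.0 - miss_rate) * ram_cost + miss_rate * disk_cost


all_hits = effective_access_cost(0.0)        # every access served from RAM
one_percent_miss = effective_access_cost(0.01)

# 0.99 * 1 + 0.01 * 1000 = 10.99, roughly an 11x slowdown
slowdown = one_percent_miss / all_hits
print(round(slowdown, 2))
```

Under these assumptions the rare misses dominate the average cost, which is the argument for keeping all data in RAM rather than treating RAM as a cache.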

Wednesday
Oct 10, 2012

Antirez: You Need to Think in Terms of Organizing Your Data for Fetching

Salvatore Sanfilippo wrote a response to Michel Martens' An Open Minded Reader. There's nothing in the post or response that's controversial. I was just struck by what a clear explication the conversation was of all the effort that goes into optimizing read paths. We optimize reads through denormalisation, a crazy quilt of caching layers, key-value databases, clustering of related tables, SSD/RAM, DHTs, moving functions to storage, secondary indexes, separating OLAP from OLTP, etc. We often focus so much on specific techniques that we can forget the bigger picture of what's going on. This little exchange made me look again at the forest, not just the trees.
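
As one illustration of organizing data for fetching, here is a minimal sketch of denormalisation. It is not taken from either post. The class names and the timeline scenario are assumptions chosen for illustration. The second store does extra work at write time so that a read becomes a single lookup.

```python
from collections import defaultdict


class NormalizedStore:
    """Each post is stored once; a read gathers and sorts across authors."""

    def __init__(self):
        self.posts_by_author = defaultdict(list)
        self.following = defaultdict(set)

    def follow(self, reader, author):
        self.following[reader].add(author)

    def post(self, author, seq, text):
        self.posts_by_author[author].append((seq, text))

    def timeline(self, reader):
        # Read-time work: collect every followed author's posts, then sort.
        merged = [p for a in self.following[reader]
                  for p in self.posts_by_author[a]]
        return [text for _, text in sorted(merged, reverse=True)]


class DenormalizedStore(NormalizedStore):
    """Copies each post into followers' timelines at write time."""

    def __init__(self):
        super().__init__()
        self.followers = defaultdict(set)
        self.timelines = defaultdict(list)

    def follow(self, reader, author):
        super().follow(reader, author)
        self.followers[author].add(reader)

    def post(self, author, seq, text):
        super().post(author, seq, text)
        # Write-time work: fan the post out to every current follower.
        for reader in self.followers[author]:
            self.timelines[reader].append((seq, text))

    def timeline(self, reader):
        # Read is one lookup; assumes posts arrive in increasing seq order.
        return [text for _, text in reversed(self.timelines[reader])]


stores = [NormalizedStore(), DenormalizedStore()]
for store in stores:
    store.follow("reader", "alice")
    store.follow("reader", "bob")
    store.post("alice", 1, "a")
    store.post("bob", 2, "b")
    store.post("alice", 3, "c")

results = [store.timeline("reader") for store in stores]
print(results)
```

The trade is the usual one: writes cost more and the data is duplicated, but the read path no longer has to assemble anything. This sketch does not backfill posts made before a follow.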

Michel Martens:

Monday
Oct 8, 2012

How UltraDNS Handles Hundreds of Thousands of Zones and Tens of Millions of Records

This is a guest post by Jeffrey Damick, Principal Software Engineer for Neustar. Jeffrey has overseen the software architecture for UltraDNS for the last two and a half years as it went through a substantial revitalization.

UltraDNS is one of the top DNS providers, serving many top-level domains (TLDs) as well as second-level domains (SLDs). This requires handling several hundred thousand zones, many of which contain millions of records each. Even with all of its success, UltraDNS had fallen into a rut several years ago: its release schedule had become haphazard at best, and the team was struggling to keep up with feature requests in a waterfall development style.

Development

Thursday
Oct 4, 2012

Stuff The Internet Says On Scalability For October 5, 2012

It's HighScalability Time:

  • 30 Million: Lady Gaga Twitter followers; 1 Billion: active Facebook users
  • Quotable Quotes:
    • @mappingbabel: Oracle exec says "we're not competing with Amazon for Netflix, we're competing with Amazon for Boeing,"
    • @CompSciFact: "Most software looks more like a whirlpool than a pipeline." #gotoaar
    • @ibogost: When someone says "Big Data," I always check to see if I still have my wallet.
    • @beezly: Google Compute Engine is "LIKE A HUGE SPACE LASER" #FACT #velocityconf
    • @rem: Bandwidth stops helping speed up web sites around 5mb (in a mobile context), yet reducing latency has a linear positive impact #VelocityConf
  • Funny take on a Hacker News front page

  • What We Found Scanning Millions of Email Systems: We found that, on average, it took 0.3 second to establish a connection, and 1.4 seconds to complete an SMTP transaction < So why do we need a new message bus again when email works just fine?


Thursday
Oct 4, 2012

LinkedIn Moved from Rails to Node: 27 Servers Cut and Up to 20x Faster

Update: More background by Ikai Lan, who worked on the mobile server team at LinkedIn, says some facts were left out: the app made "a cross data center request, guys. Running on single-threaded Rails servers (every request blocked the entire process), running Mongrel, leaking memory like a sieve." Which explains why any non-blocking approach would be a win. And Ikai, I hope as you do that nobody reads HS and just does what somebody else does without thinking. The goal here is information that you can use to make your own decisions.
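
To see why a non-blocking approach helps when every request waits on a slow remote call, here is a minimal sketch. It is not LinkedIn's code. It uses Python's asyncio rather than Node.js, and a sleep stands in for the cross data center request. The function names and delays are assumptions for illustration.

```python
import asyncio
import time


def handle_blocking(delay):
    # Stand-in for a slow remote call that blocks the whole process.
    time.sleep(delay)


def run_blocking(n, delay):
    """Serve n requests one after another; returns elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        handle_blocking(delay)
    return time.perf_counter() - start


async def handle_non_blocking(delay):
    # Yields to the event loop while waiting, so other requests proceed.
    await asyncio.sleep(delay)


async def _serve_all(n, delay):
    await asyncio.gather(*(handle_non_blocking(delay) for _ in range(n)))


def run_non_blocking(n, delay):
    """Serve n requests concurrently on one thread; returns elapsed seconds."""
    start = time.perf_counter()
    asyncio.run(_serve_all(n, delay))
    return time.perf_counter() - start


blocking_elapsed = run_blocking(5, 0.05)
non_blocking_elapsed = run_non_blocking(5, 0.05)
print(f"blocking: {blocking_elapsed:.2f}s, non-blocking: {non_blocking_elapsed:.2f}s")
```

In the blocking version the waits add up, so five requests take about five times the delay. In the non-blocking version the waits overlap, so total time stays close to a single delay.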

Ryan Paul has written an excellent behind-the-scenes look at LinkedIn’s mobile engineering. While the mobile part of the story--23% mobile usage; focus on simplicity, ease of use, and reliability; using a room metaphor; 30% native, 80% HTML; embedded lightweight HTTP server; single client-app connection--could help guide your mobile strategy, the backend effects of moving from Rails to Node.js may also prove interesting. 

After evaluation, some of the advantages of Node.js were:

Tuesday
Oct 2, 2012

An Epic TripAdvisor Update: Why Not Run on the Cloud? The Grand Experiment.

This is a guest post by Shawn Hsiao, Luke Massa, and Victor Luu. Shawn runs TripAdvisor’s Technical Operations team, Luke and Victor interned on his team this past summer. This post is introduced by Andy Gelfond, TripAdvisor’s head of engineering.

It's been a little over a year since our last post about the TripAdvisor architecture. It has been an exciting year. Our business and team continue to grow, we are now an independent public company, and we have continued to keep and scale our development process and culture as we have grown: we still run dozens of independent teams, and each team continues to work across the entire stack. All that has changed are the numbers:

  • 56M visitors per month
  • 350M+ page requests a day
  • 120TB+ of warehouse data running on a large Hadoop cluster, and quickly growing

We also had a very successful college intern program that brought on over 60 interns this past summer, all of whom were quickly onboarded and doing the same kind of work as our full-time engineers.

One recurring idea around here is: why not run on the cloud? Two of our summer interns, Luke Massa and Victor Luu, took a serious look at this question by deploying a complete version of our site on Amazon Web Services. Here, in their own words and with a lot of technical detail, is their story of what they did this past summer.

Running TripAdvisor on AWS

Tuesday
Oct 2, 2012

Sponsored Post: Akiban, Wiredrive, NY Times, CouchConf, FiftyThree, ROBLOX, Percona, ElasticHosts, ScaleOut, New Relic, NetDNA, GigaSpaces, AiCache, Logic Monitor, AppDynamics, CloudSigma

Who's Hiring?

  • Wiredrive is looking for a SENIOR WEB APPLICATION SYSTEMS ADMINISTRATOR and a TEST AUTOMATION ENGINEER to join our agile infrastructure team. For full job descriptions please see http://wdrv.it/QA6iTw
  • The New York Times is seeking a developer focused on infrastructure to join its newsroom development team. Read the full description here and send resumes to chadas@nytimes.com.
  • FiftyThree, the company behind the award-winning iPad app Paper, is looking for a {Backend || DevOps} Engineer to help us build our next great product: a service to "bring ideas together". http://www.fiftythree.com/jobs
  • New Relic is looking for a Java Scalability Engineer in Portland, OR. Ready to scale a web service with more incoming bits/second than Twitter?  http://newrelic.com/about/jobs
  • Join the team at ROBLOX as a Senior Database Administrator and help us advance our rapidly growing gaming platform with over 30K web hits/sec, 75K+ database requests/sec, and over 1 petabyte of monthly CDN traffic. Sound cool? Apply here.

Fun and Informative Events

  • Integrating location-based capabilities doesn't have to be difficult or expensive. Join us for "How to create Geospatial Indexes for Nearest Neighbor and Geofencing queries in Akiban." Register Here
  • CouchConf is a one-day, three-track event for any developer who wants to take a deeper dive into Couchbase NoSQL technology, learn where it’s headed and build really cool stuff.
  • Percona announces MySQL training for busy professionals: Developer Training for MySQL. Percona is offering savings of over 35% for this course in the month of August.

Cool Products and Services

  • aiCache creates a better user experience by increasing the speed, scale, and stability of your web-site. Test aiCache acceleration for free. No sign-up required. http://aicache.com/deploy
  • ElasticHosts launches white-label cloud reseller program offering 30% revenue share on fully rebranded cloud hosting.
  • ScaleOut Software. In-memory Data Grids for the Enterprise. Download a Free Trial.
  • Follow the Cloudify blog to learn more about our open source PaaS stack – latest integration recipes, builds, features, and other cool stuff.  Visit the GigaSpaces blog to learn how to take your application to the next level of scalability and performance.
  • NetDNA, a Tier-1 Global Content Delivery Network, offers a Dual-CDN strategy which allows companies to utilize a redundant infrastructure while leveraging the advantages of multiple CDNs to reduce costs.
  • LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.
  • AppDynamics is the very first free product designed for troubleshooting Java performance while getting full visibility in production environments. Visit http://www.appdynamics.com/free.
  • CloudSigma. Utility style high performance cloud servers in the US and Europe delivered on all 10GigE networking. Run any OS, take advantage of SSD storage and tailored infrastructure options.

For a longer description of each sponsor, please read more below...

Friday
Sep 28, 2012

Stuff The Internet Says On Scalability For September 28, 2012

It's HighScalability Time:

  • Quotable Quotes:
    • @dbasch: The world is full of "scalability engineers" who would die from an orgasm if their software ever saw 10,000 requests in a day.
    • @mtnygard: “Scaling issues are always expressed as a queue backing up somewhere.” —@moonpolysoft #strangeloop
    • @rbranson: If your data fits in main memory, you're doing it wrong. #strangeloop
    • @peakscale: Using schemaless DBs an "overreaction" & "confuses the poor impl. of schemas with the value that schemas provide"
    • @adrianco: GM: Performance analysis is complicated by your brain thinking LINEARLY about a computer system that is NONLINEAR. 
    • @littleidea: it's better to have infinite scalability and not need it, than to need infinite scalability and not have it
  • Looks like Google is on the right track with their language understanding efforts. How hierarchical is language use: In this paper, we review evidence from the recent literature supporting the hypothesis that sequential structure may be fundamental to the comprehension, production and acquisition of human language. Moreover, we provide a preliminary sketch outlining a non-hierarchical model of language use and discuss its implications and testable predictions.

Wednesday
Sep 26, 2012

WordPress.com Serves 70,000 req/sec and over 15 Gbit/sec of Traffic using NGINX

This is a guest post by Barry Abrahamson, Chief Systems Wrangler at Automattic, and Nginx's cofounder Andrew Alexeev.

WordPress.com serves more than 33 million sites attracting over 339 million people and 3.4 billion pages each month. Since April 2008, WordPress.com has experienced about 4.4 times growth in page views. WordPress.com VIP hosts many popular sites including CNN’s Political Ticker, NFL, Time Inc’s The Page, People Magazine’s Style Watch, corporate blogs for Flickr and KROQ, and many more. Automattic operates two thousand servers in twelve globally distributed data centers. WordPress.com customer data is instantly replicated between different locations to provide an extremely reliable and fast web experience for hundreds of millions of visitors.

Problem
