Entries by HighScalability Team (1576)

Friday
Feb 08, 2013

Stuff The Internet Says On Scalability For February 8, 2013

Hey, it's HighScalability time:

  • 34TB : storage for GitHub search; 2,880,000,000: log lines per day
  • Quotable Quotes:
    • @peakscale: The "IKEA effect"  << Contributes to NIH and why ppl still like IaaS over PaaS. :-\
    • @sheeshee: module named kafka.. creates weird & random processes, sends data from here to there & after 3 minutes noone knows what's happening anymore?
    • @sometoomany: Ceased writing a talk about cloud computing infrastructure, and data centre power efficiency. Bored myself to death, but saved others.

  • Lots of heat on Is MongoDB's fault tolerance broken? Yes it is. No it's not. YES it is. And the score: MongoDB Is Still Broken by Design 5-0.

  • Every insurgency must recruit from an existing population which is already affiliated elsewhere. For web properties the easiest group to recruit is the younger demographic. They naturally want something different than their elders and they have fewer allegiances to defend. What's your counterinsurgency strategy?

Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Wednesday
Feb 06, 2013

Super Bowl Advertisers Ready for the Traffic? Nope..It's Lights Out.

Advertising for the Super Bowl is bigger than the game for many viewers. So you gotta figure advertisers are ready for the traffic bursts generated by their expensive ads? Not exactly...

Yottaa reports that an amazing 13 advertiser websites crashed during the Super Bowl. Coke was interactively au courant, asking viewers to vote for the ending of a commercial, but load times went to 62 seconds. SodaStream, Calvin Klein, Axe, Got Milk?, The Walking Dead, many movie sites, and many car sites were all flagged with delay of fame penalties.

Lots of time, money, and creative energy are spent lovingly perfecting every detail of these commercials. It won't be a surprise to any programmer that this can't usually be said of the follow-through on the backend.

So what can you do? Yottaa has some good tips and Michael Hamrah has a wonderful post on dealing with the Super Bowl Burst Problem:

Click to read more ...

Tuesday
Feb 05, 2013

Sponsored Post: Amazon, Zoosk, aiCache, Teradata Aster, Aerospike, Percona, ScaleOut, New Relic, NetDNA, LogicMonitor, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • The AWS Relational Database Service (RDS) automates management of relational databases in the cloud. We have a wide variety of customers and are part of many mission-critical applications, like the ones built by the 2012 Obama re-election campaign. If you're interested in joining a fast-growing service and team, please send your resume to rds-jobs@amazon.com.
  • Hiring! Director of Site Operations at Zoosk.  We’re looking for an innovator. Someone who wants to take site operations along with a smart team of Sys Admins to the next level. This is a very hands-on leadership role in a high-availability production environment. Full details here. 
  • Teradata Aster is looking for Distributed Systems, Analytic Applications,  and Performance Architects. As a member of the Architecture Group you will help define the technical roadmap for the product.
  • The New York Times is seeking a developer focused on infrastructure to join its newsroom development team. Read the full description here and send resumes to chadas@nytimes.com.
  • New Relic is looking for a Java Scalability Engineer in Portland, OR. Ready to scale a web service with more incoming bits/second than Twitter?  http://newrelic.com/about/jobs
  • Aerospike is Hiring! You dream in C - and like it? Then join us as a Senior Distributed Systems Engineer or Client / Application Engineer. People covet your bag of tricks for troubleshooting systems and network issues? Join our Operations and QA team. See if these positions are a fit for you!

Fun and Informative Events

Cool Products and Services

  • aiCache creates a better user experience by increasing the speed, scale, and stability of your website. Test aiCache acceleration for free. No sign-up required. http://aicache.com/deploy
  • New Benchmark shows Aerospike nearly 10x Faster than the Competition. Thumbtack Technology YCSB Benchmark shows Aerospike nearly 10x faster than Cassandra, Couchbase and MongoDB. Read it now!
  • ScaleOut Software. In-Memory Data Grids for the Enterprise. Download a Free Trial.
  • NetDNA, a Tier-1 Global Content Delivery Network, offers a Dual-CDN strategy which allows companies to utilize a redundant infrastructure while leveraging the advantages of multiple CDNs to reduce costs.
  • LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.
  • AppDynamics is the very first free product designed for troubleshooting Java performance while getting full visibility in production environments. Visit http://www.appdynamics.com/free.
  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.
  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Click to read more ...

Monday
Feb 04, 2013

Is Provisioned IOPS Better? Yes, it Delivers More Consistent and Higher Performance IO

Amazon created a whole new class of service with their Provisioned IOPS for RDS, EBS, and DynamoDB. The idea is simple. If you want more performance, you turn a dial up. If you want less, you turn a dial down. A beautifully simple model. You pay for the performance you want, which is different than their previous cloud model, where performance varied, but you paid only for what you used. 

The question: Do these higher priced services really work better?

Rodrigo Campos put this question to the test (only for EBS) by running a benchmark he describes in IOMelt Provisioned IOPS EBS Benchmark Results - December 2012.

The result? Yes, AWS Provisioned IOPS Volumes Really Deliver More Consistent and Higher Performance IO:

Click to read more ...

Friday
Feb 01, 2013

Stuff The Internet Says On Scalability For February 1, 2013

Hey, it's HighScalability time:

Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Wednesday
Jan 30, 2013

Better Browser Caching is More Important than No Javascript or Fast Networks for HTTP Performance

Performance guru Steve Souders gave his keynote presentation, Cache is King! (slides), at HTML5DevCon. Besides giving an extremely clear explanation of how caching works on the Internet and how to optimize your use of HTTP to get the best performance, Steve ran experiments with some surprising results about which changes give the biggest website performance improvements.

In his baseline test, page loads took 7.65 seconds (median of three runs). What change--Fast Network, No JavaScript, or Primed Cache--would make the biggest performance improvement? It was Primed Cache.

  • Fast Network - Using a fast FiOS network the load time was 4.13 seconds. Steve was surprised how big a difference this made, given how much work must happen in the browser.
  • No JavaScript - 4.74 seconds after disabling JavaScript. Disabling it both reduces transfers and skips parsing in the browser. Steve thought the effect would have been larger.
  • Primed Cache - 3.46 seconds using a warm cache, less than half the empty-cache page view time, because it reduced the number of HTTP requests and the total transfer time. Key for mobile, where higher latencies are common.
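
To put the three results on a common scale, here is a minimal sketch that computes each load time's reduction relative to the 7.65-second baseline. The timings are the ones quoted above; the function and variable names are illustrative only.

```python
# Load times in seconds, taken from the results quoted above.
BASELINE = 7.65

experiments = {
    "Fast Network": 4.13,
    "No JavaScript": 4.74,
    "Primed Cache": 3.46,
}


def reduction_pct(baseline: float, measured: float) -> float:
    """Percentage reduction in load time relative to the baseline."""
    return (baseline - measured) / baseline * 100.0


# Sort so the shortest load time (largest improvement) comes first.
ranked = sorted(experiments.items(), key=lambda kv: kv[1])

for name, seconds in ranked:
    pct = reduction_pct(BASELINE, seconds)
    print(f"{name}: {seconds:.2f}s ({pct:.1f}% shorter load time)")
```

Primed Cache comes out on top at roughly a 55% reduction, which matches the "less than half" observation.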

The implication is that caching is important, so you must understand how HTTP caching works and how to make the best use of it. That's the rest of the talk.
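
As a rough illustration of what "how HTTP caching works" means in practice, here is a simplified sketch of the freshness check a cache might apply to a stored response. It is not taken from the talk. It assumes only the standard Cache-Control directives max-age, no-cache, and no-store, and it deliberately ignores Expires, s-maxage, heuristic freshness, and revalidation. The function names are hypothetical.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header value into a {directive: value} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (e.g. "no-cache") map to None.
        directives[name.strip()] = value.strip().strip('"') or None
    return directives


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """Return True if a stored response may be reused without a request.

    Simplified assumption: only max-age, no-cache, and no-store are
    considered. A real cache handles many more cases.
    """
    directives = parse_cache_control(cache_control)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or not max_age.isdigit():
        return False  # no explicit lifetime in this sketch
    return age_seconds < int(max_age)


print(is_fresh("public, max-age=3600", 120))
print(is_fresh("no-cache", 0))
```

A response that is still fresh is served from the local cache with no HTTP request at all, which is the effect the Primed Cache experiment measures.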

Some key takeaways: 

Click to read more ...

Monday
Jan 28, 2013

DuckDuckGo Architecture - 1 Million Deep Searches a Day and Growing

This is an interview with Gabriel Weinberg, founder of DuckDuckGo and general all-around startup guru, on what DDG’s architecture looks like in 2012.

Innovative search engine upstart DuckDuckGo had 30 million searches in February 2012 and averages over 1 million searches a day. It’s being positioned by super investor Fred Wilson as a clean, private, impartial and fast search engine. After talking with Gabriel I like what Fred Wilson said earlier, it seems closer to the heart of the matter: We invested in DuckDuckGo for the Reddit, Hacker News anarchists.
                  
Choosing DuckDuckGo can be thought of as not just a technical choice, but a vote for revolution. In an age when knowing your essence is not about love or friendship, but about more effectively selling you to advertisers, DDG is positioning themselves as the do-not-track alternative, keepers of the privacy flame. You will still be monetized of course, but in a more civilized and anonymous way.

Pushing privacy is a good way to carve out a competitive niche against Google et al, as by definition they can never compete on privacy. I get that. But what I found most compelling is DDG’s strong vision of a crowdsourced network of plugins giving broader search coverage by tying an army of vertical data suppliers into their search framework. For example, there's a specialized Lego plugin for searching against a complete Lego database. Use the name of a spice in your search query, for example, and DDG will recognize it and may trigger a deeper search against a highly tuned recipe database. Many different plugins can be triggered on each search and it’s all handled in real-time.
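
The idea of keyword-triggered plugins can be sketched as follows. This is a purely hypothetical illustration, not DDG's implementation: the Plugin class, the trigger words, and the stand-in search functions are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Plugin:
    name: str
    triggers: frozenset[str]       # words that activate this plugin
    search: Callable[[str], str]   # stand-in for a vertical data source


# Hypothetical plugins; a real system would query specialized databases.
PLUGINS = [
    Plugin("recipes", frozenset({"cumin", "paprika", "saffron"}),
           lambda q: f"recipe results for {q!r}"),
    Plugin("lego", frozenset({"lego"}),
           lambda q: f"Lego database results for {q!r}"),
]


def triggered_plugins(query: str) -> list[Plugin]:
    """Return every plugin with at least one trigger word in the query."""
    words = set(query.lower().split())
    return [p for p in PLUGINS if words & p.triggers]


def deep_search(query: str) -> dict[str, str]:
    """Run the query against all triggered plugins and collect results."""
    return {p.name: p.search(query) for p in triggered_plugins(query)}


print(deep_search("saffron chicken"))
print(deep_search("weather today"))
```

Several plugins can fire on the same query, and a query with no trigger words falls through to an empty result, leaving ordinary keyword search to handle it.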

Can’t searching the Open Web provide all this data? Not really. This is structured data with semantics. Not an HTML page. You need a search engine that’s capable of categorizing, mapping, merging, filtering, prioritizing, searching, formatting, and disambiguating richer data sets and you can’t do that with a keyword search. You need the kind of smarts DDG has built into their search engine. One problem of course is now that data has become valuable many grown-ups don’t want to share anymore.

Being ad supported puts DDG in a tricky position. Targeted ads are more lucrative, but ironically DDG’s do-not-track policies mean they can’t gather targeting data. Yet that’s also a selling point for those interested in privacy. But as search is famously intent driven, DDG’s technology of categorizing queries and matching them against data sources is already a form of high value targeting.

It will be fascinating to see how these forces play out. But for now let’s see how DuckDuckGo implements their search engine magic...

Information Sources

Click to read more ...

Friday
Jan 25, 2013

Stuff The Internet Says On Scalability For January 25, 2013

Sorry, Stuff the Internet Says has been called on account of a power outage. Gods of rain and tree have interfered with thee. Instead, how about watching a little Python? (that's Monty, not the language)

Thursday
Jan 24, 2013

NoSQL Parody: say No! No! and No!

While certainly not in the same class as Hilarious Video: Relational Database vs NoSQL Fanbois or NSFW: Hilarious Fault-Tolerance Cartoon, this parody does have some really good moments:

Wednesday
Jan 23, 2013

Building Redundant Datacenter Networks is Not For Sissies - Use an Outside WAN Backbone

Ivan Pepelnjak, in his short and information-packed REDUNDANT DATA CENTER INTERNET CONNECTIVITY video, shows why networking as played at the highest levels is something you want to leave to professionals, like a large animal country veterinarian delivering a stuck foal at 2AM on a dark and stormy night.

There are always a lot of questions about the black art of building redundant datacenter networks and there's a shortage of accessible explanations. What I liked about Ivan's video is how effortlessly he explains the issues and tradeoffs you can expect in designing your own solution, as well as giving creative solutions to those problems. A lot of years of experience are boiled down to a 17-minute video.

Ivan begins by showing what a canonical fully redundant datacenter would look like:

It's like an ark where everything goes two by two...

Click to read more ...