Entries by HighScalability Team (1576)

Tuesday
Mar 01, 2011

Sponsored Post: ScaleOut, aiCache, WAPT, Karmasphere, Kabam, Opera Solutions, Newrelic, Cloudkick, Membase, Joyent, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

Fun and Informative Events

Cool Products and Services

  • ScaleOut StateServer - Scale Out Your Server Farm Applications!
  • aiCache creates a better user experience by increasing the speed, scale, and stability of your website.
  • WAPT is a load, stress and performance testing tool for websites and web-based applications.
  • Karmasphere is bringing Apache Hadoop power to developers and analysts. Download your Free Community Edition today!
  • Newrelic - What are you doing to ensure the performance of your apps?
  • Cloudkick - monitor & manage your servers better with a FREE Cloudkick developer account.
  • Learn how two game developers prepared for rapid user growth in this recorded Joyent webinar: http://bit.ly/hzBoib.
  • CloudSigma. Instantly scalable European cloud servers.
  • ManageEngine Applications Manager: Monitor physical, virtual and Cloud Applications.
  • www.site24x7.com: Monitor End User Experience from a global monitoring network.

Click to read more ...

Thursday
Feb 24, 2011

Strategy: Eliminate Unnecessary SQL

MySQL expert Ronald Bradford explains that one key way to improve the scalability of a MySQL server, and undoubtedly nearly every other server, is to eliminate unnecessary SQL. As he puts it, the most efficient way to improve an SQL statement is to eliminate it:

The MySQL kernel can only physically process a certain number of SQL statements for a given time period (e.g. per second). Regardless of the type of machine you have, there is a physical limit. If you eliminate SQL statements that are unwarranted and unnecessary, you automatically enable more important SQL statements to run. There are numerous other downstream effects, however this is the simple math. To run more SQL, reduce the number of SQL you need to run.

Ronald shows how to use mk-query-digest to look at query execution times and determine which ones can be profitably whacked. 
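
One way to picture the hunt for removable statements is to group queries by shape and count how often each shape runs. The sketch below is only an illustration under stated assumptions: it is not how mk-query-digest is implemented, and the names fingerprint, most_repeated and sample_log are hypothetical, as is the sample data.

```python
import re
from collections import Counter


def fingerprint(sql: str) -> str:
    """Collapse literals so structurally identical statements group together."""
    s = sql.strip().lower()
    s = re.sub(r"'[^']*'", "?", s)   # replace quoted string literals
    s = re.sub(r"\b\d+\b", "?", s)   # replace standalone numeric literals
    s = re.sub(r"\s+", " ", s)       # collapse runs of whitespace
    return s


def most_repeated(statements, top=3):
    """Return the `top` most frequent statement shapes with their counts."""
    counts = Counter(fingerprint(s) for s in statements)
    return counts.most_common(top)


# Hypothetical sample of logged statements (not real data).
sample_log = [
    "SELECT name FROM users WHERE id = 42",
    "SELECT name FROM users WHERE id = 7",
    "SELECT  name FROM users WHERE id = 42",
    "UPDATE users SET last_seen = '2011-02-24' WHERE id = 42",
    "SELECT 1",
]

if __name__ == "__main__":
    for shape, count in most_repeated(sample_log):
        print(count, shape)
```

A shape that dominates the counts is a candidate for caching, batching, or outright removal. Whether a given statement is actually unnecessary still has to be decided by looking at the application.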

Click to read more ...

Wednesday
Feb 23, 2011

This stuff isn't taught, you learn it bit by bit as you solve each problem.

"For the things we have to learn before we can do them, we learn by doing them." -- Aristotle

A really nice Internet moment happened in the HackerNews thread Disqus: Scaling the World’s Largest Django Application, when David Kitchen crafted an awesome response to a question about how you learn to build scalable systems. It's so good I thought I would reproduce it here.

Question: asked by grovulent:

Not like this is a problem I have to worry about. But where on earth does one learn this stuff? The talk is useful - as an overview of what they use - but I know nothing of how to implement a single step.

Answer: answered by David Kitchen of buro9:

Click to read more ...

Tuesday
Feb 22, 2011

Is Node.js Becoming a Part of the Stack? SimpleGeo Says Yes.

This is an interview with Wade Simmons, an Infrastructure Engineer at SimpleGeo, a service making it easy for developers to create location-aware applications, on their increasing use of Node.js as a backend service component, replacing code that would at one time have been written in Java, Python or Ruby. Node.js is finding its way into many stacks these days and I was curious why that might be. My experience writing several messaging systems is that programmers don't like the async model, so it's a big surprise that a pure async programming model like Node.js, especially one that uses server-side JavaScript, would be taking off. Wade was generous enough to help explain their reasoning behind using Node.js at SimpleGeo. I'd really like to thank Wade for taking the time for this interview. He did a really great job and provided a lot of insight on how the modern web stack is evolving in the crucible of real-life experience.

And here begins the interview with Wade Simmons:

Click to read more ...

Friday
Feb 18, 2011

Stuff The Internet Says On Scalability For February 18, 2011

Submitted for your reading pleasure on this cold and rainy Friday...

Click to read more ...

Wednesday
Feb 16, 2011

Paper: An Experimental Investigation of the Akamai Adaptive Video Streaming

Video is hot on the Internet and people are really interested in knowing how to make it work. Dan Rayburn has a post pointing to a fascinating paper: An Experimental Investigation of the Akamai Adaptive Video Streaming, which talks in some detail about the protocols big players like YouTube, Skype and Akamai use to serve video over an inherently video-unfriendly medium like the Internet. For Akamai they found:

  1. Each video is encoded in five versions at different bit rates and stored in separate files.
  2. The client sends commands to the server with an average inter-departure time of about 2 s, i.e. the control algorithm is executed on average every 2 seconds.
  3. Akamai uses only the video level to adapt the video source to the available bandwidth, whereas the frame rate of the video is kept constant.
  4. When a sudden drop in the available bandwidth occurs, short interruptions of the video playback can occur due to a large actuation delay.
  5. For a sudden increase of the available bandwidth, the transient time to match the new bandwidth is roughly 150 seconds.
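
To make the idea of adapting only the video level concrete, here is a deliberately naive sketch of a periodic level selector. It is not Akamai's algorithm: the bit rates in LEVELS_KBPS are hypothetical, pick_level and simulate are made-up names, and this version reacts instantly, whereas the paper measured a transient of roughly 150 seconds on a bandwidth increase.

```python
# Hypothetical bit rates in kbit/s for five encoded versions; the real
# values are not given in the text above.
LEVELS_KBPS = [300, 700, 1500, 2500, 3500]


def pick_level(available_kbps: float, levels=LEVELS_KBPS) -> int:
    """Return the index of the highest level that fits the bandwidth.

    Falls back to the lowest level when even that one does not fit.
    """
    best = 0
    for i, rate in enumerate(levels):
        if rate <= available_kbps:
            best = i
    return best


def simulate(bandwidth_samples, interval_s=2):
    """Run the selection once per control interval; return (time, level) pairs."""
    return [(t * interval_s, pick_level(bw))
            for t, bw in enumerate(bandwidth_samples)]


if __name__ == "__main__":
    # One hypothetical bandwidth estimate per 2-second control interval.
    for time_s, level in simulate([4000, 4000, 800, 800, 2600]):
        print(time_s, level)
```

The frame rate never appears in the sketch, mirroring finding 3: the only knob being turned is which encoded version is served.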

Abstract:

Click to read more ...

Tuesday
Feb 15, 2011

Wordnik - 10 million API Requests a Day on MongoDB and Scala

Wordnik is an online dictionary and language resource that has both a website and an API component. Their goal is to show you as much information as possible, as fast as they can find it, for every word in English, and to give you a place where you can make your own opinions about words known. As cool as that is, what is really cool is the information they share in their blog about their experiences building a web service. They've written an excellent series of articles and presentations you may find useful:
  • What has technology done for words lately?
    • Eventual consistency. Using an eventually consistent model, they can do work in parallel: they count as many words as possible when they can, and add them all up when there’s a lag. The count’s always in the ballpark, and they never have to stop.
    • Document-oriented storage. Dictionary entries are more naturally modeled as hierarchical documents, and using that model has made it quicker to find data and easier to develop against.
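
The eventual-consistency point above can be sketched as partial counters that are built independently and merged whenever convenient. This is only an illustration, not Wordnik's code: count_chunk, merge and the sample chunks are hypothetical, and a real system would run the counting on separate workers.

```python
from collections import Counter


def count_chunk(text: str) -> Counter:
    """Count words in one chunk of text, independently of all other chunks."""
    return Counter(text.lower().split())


def merge(partials) -> Counter:
    """Add up whichever partial counts have arrived so far."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Hypothetical input; each chunk could be counted by a different worker.
chunks = ["the cat sat", "the dog sat", "The end"]
partials = [count_chunk(c) for c in chunks]

approx = merge(partials[:2])  # a reader asking now gets a ballpark figure
final = merge(partials)       # once every partial has been folded in

if __name__ == "__main__":
    print(approx["the"], final["the"])
```

Because merging is plain addition, partial counts can arrive late or in any order and the total still converges, which is why the counting never has to stop.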

Click to read more ...

Tuesday
Feb 15, 2011

Sponsored Post: Karmasphere, Kabam, Opera Solutions, Percona, Appirio, Newrelic, Cloudkick, Membase, EA, Joyent, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

Fun and Informative Events

  • Percona Live to be held in San Francisco February 16th, 2011. A one-day event run by the experts behind the MySQL Performance Blog.
  • A new round of Membase meetups has been planned for January 2011 for San Diego, Denver, Seattle, Vancouver and Chicago.

Cool Products and Services

Click to read more ...

Friday
Feb 11, 2011

Stuff The Internet Says On Scalability For February 11, 2011

Submitted for your reading pleasure...

  • A good night's sleep is why Facebook CTO Bret Taylor says FriendFeed should have gone cloud and let others take the midnight watch, even if it costs a bit more.
  • James Urquhart with an information-packed interview on a wide range of cloud topics: Cloud Expert Inside theCube at Strata Conference. Highlights: cloud is an operations model, it is not a technology, it is a way to apply technology to problems; the faster you can get the resources into the hands of the people who use it the more money you save overall; cloud is a cash flow story, not a savings story; services aren't about servers or storage, they are about applications.
  • Quotable Quotes:
    • Ryan Tomayko: Frameworks don’t solve scalability problems, design solves scalability problems. Via @GregSkloot
    • @zuno: The golden rule of scalability: "it can probably wait" look for other areas to save resources.
    • @bihzad: joinserv hit a scalability wall, but I'm pretty sure I can climb over it with multiprocessing

Click to read more ...

Tuesday
Feb 08, 2011

Mollom Architecture - Killing Over 373 Million Spams at 100 Requests Per Second

Mollom is one of those cool SaaS companies every developer dreams of creating when they rack their brains looking for a viable software-as-a-service startup. Mollom profitably runs a useful service, spam filtering, with a small group of geographically distributed developers. Mollom helps protect nearly 40,000 websites from spam, including one of mine, which is where I first learned about Mollom. In a desperate attempt to stop spam on a Drupal site, where every other form of CAPTCHA had failed miserably, I installed Mollom in about 10 minutes and it immediately started working. That's the out-of-the-box experience I was looking for.

Since Mollom opened its digital inspection system, they've rejected over 373 million spams and in the process they've learned that a stunning 90% of all messages are spam. This spam torrent is handled by only two geographically distributed machines that serve 100 requests/second, each running a Java application server and Cassandra. So few resources are necessary because they've created a very efficient machine learning system. Isn't that cool? So, how do they do it?

Click to read more ...