Entries by HighScalability Team (1576)

Tuesday
Apr 24, 2012

Sponsored Post: Reality Check Network, Infragistics, Gigaspaces, AiCache, ElasticHosts, Logic Monitor, Attribution Modeling, New Relic, AppDynamics, CloudSigma, ManageEngine, Site24x7

Who's Hiring? 

  • Are you looking for people?

Fun and Informative Events

  • Sign up for this free 30-minute webinar, which explores how new technology can determine which ads have been seen by users and discusses the C3 Metrics Labs analysis of over 2 billion impressions.

Cool Products and Services

  • Reality Check Network offers powerful hosting solutions and managed servers for high traffic/bandwidth websites backed by unlimited network, server and application support.
  • When you’re looking for the fastest, lightest, most complete toolset for rapidly building high performance Web 2.0 applications, you want NetAdvantage for ASP.NET.
  • Create your most stunning, highly performant, and completely mobile HTML5 applications and dashboards on any browser, platform or device – only with NetAdvantage for jQuery.
  • Take your application to the next level of performance & scalability with the GigaSpaces In-Memory Data Grid (IMDG)
  • aiCache creates a better user experience by increasing the speed, scale, and stability of your website. Test aiCache acceleration for free. No sign-up required. http://aicache.com/deploy
  • ElasticHosts award winning cloud server hosting launches across North America. Adding data centers in Los Angeles and Toronto. Free trial.
  • LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.
  • New Relic - real user monitoring: optimize for humans, not bots. Live application stats, SQL/NoSQL performance, web transactions, proactive notifications. Take 2 minutes to sign up for a free trial.
  • AppDynamics is the very first free product designed for troubleshooting Java performance while getting full visibility in production environments. Visit http://www.appdynamics.com/free.
  • CloudSigma. Utility style high performance cloud servers in the US and Europe delivered on all 10GigE networking. Run any OS, take advantage of SSD storage and tailored infrastructure options.
  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.
  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

For a longer description of each sponsor, please read more below...

Click to read more ...

Friday
Apr 20, 2012

Stuff The Internet Says On Scalability For April 20, 2012

It's HighScalability Time:

  • 100 Billion PVs/Week: Plenty of Fish; 695k TPS: Node.js & VoltDB; 131ms: Response from EC2 EU
  • Quotable quotes:
    • Markus Frind: Scaling is not very fun these days.
    • Mike Krieger: Scaling is like replacing all components on a car while driving it at 100 mph.
    • @jaykreps: We built servers that could handle 100k connections by building servers that could handle 10k connections and waiting a decade.
    • @RichExperiences: Twitter alone generates more than 7 terabytes of data every day, Facebook 10 TB...
    • @HectorESoto: “Scalability is about building wider roads, not about building faster cars.” – Steve Swartz
    • @jasongorman: Put our logic in the client and our data in the "cloud"? What does that remind me of? Microsoft Visual Basic.
    • @jasobrown: #nasa's use of AWS/cloud is quite similar to netflix. long term storage = s3, spin up pre-make (baked) amis, using VPC for extended firewall
  • Netflix never used its $1 million algorithm due to engineering costs. Ease of implementation often trumps absolute measures of performance. That's keeping it simple.
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Wednesday
Apr 18, 2012

Ansible - A Simple Model-Driven Configuration Management and Command Execution Framework

This is a guest post by Michael DeHaan (@laserllama), a software developer and architect, on Ansible, a simple deployment, model-driven configuration management, and command execution framework.

I owe High Scalability a great deal of credit for the idea behind my latest software project. I was reading about how an older tool I helped create, Func, was used at Tumblr, and it kicked some ideas into gear. This article is about what happened from that idea.

My observation, which the article reinforced, was that many shops end up using a configuration management tool (Puppet, Chef, cfengine), a separate deployment tool (Capistrano, Fabric) and yet another separate ad-hoc task execution tool (Func, pssh, etc) because one class of tool historically hasn't been good at all three jobs.

My other observation (not from the article) was that the whole "infrastructure as code" movement, while revolutionary, and definitely great for many, was probably secretly grating on a good number of systems administrators. As a software developer, I myself can empathize -- the software design/development/testing process is frequently painful, and I would rather think of infrastructure as being data-driven. Data is supposed to be simple; programs are often not. This is why I made Ansible.

Ansible: How is it Different?

Click to read more ...

Tuesday
Apr 17, 2012

YouTube Strategy: Adding Jitter isn't a Bug

The adding jitter strategy was one of the most commented on techniques from 7 Years Of YouTube Scalability Lessons In 30 Minutes on HackerNews. Probably because it’s one of the emergent phenomena that you really can’t predict and is shocking when you see it in real life. Here’s the technique:

Add Entropy Back into Your System

  • If your system doesn’t jitter then you get thundering herds. Distributed applications are really weather systems. Debugging them is as deterministic as predicting the weather. Jitter introduces more randomness because surprisingly, things tend to stack up.
  • For example, cache expirations. For a popular video they cache things as best they can. The most popular video they might cache for 24 hours. If everything expires at one time then every machine will calculate the expiration at the same time. This creates a thundering herd.
  • By jittering you are saying: randomly expire between 18 and 30 hours. That prevents things from stacking up. They use this all over the place. Systems have a tendency to self synchronize as operations line up and try to destroy themselves. Fascinating to watch. You get a slow disk system on one machine and everybody is waiting on a request, so all of a sudden all these other requests on all these other machines are completely synchronized. This happens when you have many machines and you have many events. Each one actually removes entropy from the system so you have to add some back in.
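
The cache expiration example above can be sketched in a few lines. This is a minimal illustration only, assuming a 24-hour base lifetime with six hours of spread in either direction to match the 18-30 hour window; the function name `jittered_ttl_seconds` and the constants are hypothetical and not taken from YouTube's code.

```python
import random

# Assumed values for illustration: a 24-hour nominal lifetime with
# +/- 6 hours of spread yields the 18-30 hour window from the text.
BASE_TTL_HOURS = 24
JITTER_HOURS = 6


def jittered_ttl_seconds(base_hours=BASE_TTL_HOURS,
                         jitter_hours=JITTER_HOURS,
                         rng=random):
    """Return a TTL in seconds drawn uniformly from
    [base - jitter, base + jitter] hours."""
    low = (base_hours - jitter_hours) * 3600
    high = (base_hours + jitter_hours) * 3600
    return rng.uniform(low, high)


if __name__ == "__main__":
    # Each cache entry gets its own expiry, so entries written at the
    # same moment no longer expire at the same moment.
    for key in ("video:1", "video:2", "video:3"):
        print(key, round(jittered_ttl_seconds() / 3600, 2), "hours")
```

Because every entry draws its own lifetime, entries cached together drift apart instead of expiring and being recomputed in lockstep, which is the thundering herd the bullets describe.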

Comments from HackerNews really help to fill out the topic with more detail:

Click to read more ...

Monday
Apr 16, 2012

Instagram Architecture Update: What’s new with Instagram?

The fascination over Instagram continues and fortunately we have several new streams of information to feed the insanity. So consider this article an update to The Instagram Architecture Facebook Bought For A Cool Billion Dollars, based primarily on Scaling Instagram, a slide deck for an AirBnB tech talk given by Instagram co-founder, Mike Krieger. Several other information sources, listed at the bottom of the article, were also used.

Unfortunately we just have a slide deck, so the connective tissue of the talk is missing, but it’s still very interesting, in the same spirit as the wisdom presentations we often see when developers come up for air after spending significant time in the trenches.

If you expect to dive deep into the technological details and find a billion reasons why Instagram was acquired, you will be disappointed. That magic can be found in the emotional investment in the relationship between all of the users and the product, not in the bits about how the bytes are managed.

So what’s new with Instagram?

Click to read more ...

Friday
Apr 13, 2012

Stuff The Internet Says On Scalability For April 13, 2012

It's HighScalability Time:

  • 50 million in 50 days : Draw Something downloads; 40 million concurrent users : Skype
  • Key to making sensors ubiquitous is getting the BOM cost down. Here's a dream way of making that happen: Bye-Bye Batteries: Radio Waves as a Low-Power Source. “Silicon technology has advanced to the point where even tiny amounts of energy can do useful work.” No batteries == cheaper, smaller products == ubiquity.
  • The MySQL “swap insanity” problem and the effects of the NUMA architecture. Jeremy Cole with a spectacular article on the differences between NUMA and SMP/UMA systems and the mostly unsatisfactory tricks required to get MySQL to perform on NUMA systems. There are really two issues: the evils of an OS controlled swap and NUMA performance effects due to a single node (in the NUMA sense) running out of memory. This is the kind of stuff you only see when you push your systems to the edge.
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Tuesday
Apr 10, 2012

Sponsored Post: Infragistics, Reality Check Network, Gigaspaces, AiCache, ElasticHosts, Logic Monitor, Attribution Modeling, New Relic, AppDynamics, CloudSigma, ManageEngine, Site24x7

Who's Hiring? 

  • Are you looking for people?

Fun and Informative Events

  • Sign up for this free 30-minute webinar, which explores how new technology can determine which ads have been seen by users and discusses the C3 Metrics Labs analysis of over 2 billion impressions.

Cool Products and Services

  • Reality Check Network offers powerful hosting solutions and managed servers for high traffic/bandwidth websites backed by unlimited network, server and application support.
  • When you’re looking for the fastest, lightest, most complete toolset for rapidly building high performance Web 2.0 applications, you want NetAdvantage for ASP.NET.
  • Create your most stunning, highly performant, and completely mobile HTML5 applications and dashboards on any browser, platform or device – only with NetAdvantage for jQuery.
  • Take your application to the next level of performance & scalability with the GigaSpaces In-Memory Data Grid (IMDG)
  • aiCache creates a better user experience by increasing the speed, scale, and stability of your website. Test aiCache acceleration for free. No sign-up required. http://aicache.com/deploy
  • ElasticHosts award winning cloud server hosting launches across North America. Adding data centers in Los Angeles and Toronto. Free trial.
  • LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.
  • New Relic - real user monitoring: optimize for humans, not bots. Live application stats, SQL/NoSQL performance, web transactions, proactive notifications. Take 2 minutes to sign up for a free trial.
  • AppDynamics is the very first free product designed for troubleshooting Java performance while getting full visibility in production environments. Visit http://www.appdynamics.com/free.
  • CloudSigma. Utility style high performance cloud servers in the US and Europe delivered on all 10GigE networking. Run any OS, take advantage of SSD storage and tailored infrastructure options.
  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.
  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

For a longer description of each sponsor, please read more below...

Click to read more ...

Monday
Apr 09, 2012

The Instagram Architecture Facebook Bought for a Cool Billion Dollars

It's been a well kept secret, but you may have heard Facebook will Buy Photo-Sharing Service Instagram for $1 Billion. Just what is Facebook buying? Here's a quick gloss I did a little over a year ago on a presentation Instagram gave on their architecture. In that article I called Instagram's architecture the "canonical description of an early stage startup in this era." Little did we know how true that would turn out to be. If you want to learn how they did it then don't take a picture, just keep on reading... 

Click to read more ...

Monday
Apr 09, 2012

Why My Slime Mold is Better than Your Hadoop Cluster

In After Life: The Strange Science Of Decay there’s a truly incredible sequence of gorgeously shot video showing how creeping slime mold solves mazes and performs other amazing feats of computation. Take a look at what simple one-celled organisms can do:



The whole video is really well done and shockingly revelatory. It’s the story of decay, how atoms created during the Big Bang and through countless supernova explosions are continually rearranged and reused by the complex process of life.

The most glaring take away for me was how sterile is the world of computing.

In the real world, material is plentiful and follows basic physical and chemical rules for self-assembly. Inside cells, chemicals and bits of RNA and DNA whiz around each other at amazing rates of speed. They crash into each other and if there's a fit a reaction takes place and something larger is created. And the process continues until larger and larger structures are built. All without organization. Everything happens because there are elements with basic properties, supplies of those elements, ways of those elements coming into contact with each other, ways for those elements to combine together, and ways for those elements to be torn apart and recycled.

Code, which you would think would be the most alive part of our systems, is as dead as statues in a church. Code is carefully constructed, deployed, groomed, and maintained. Code is no more alive than a golem.

Data is far worse off than code. Data has no independent reality or opportunity for serendipitous creation. Data is dead. What's missing is a data physics, an attempt to give life to data in the same way elements in the table of elements come alive when brought together in a physical world. Imbue data with a vitality and create a physical universe in which they can combine according to their nature. That’s what we need to make computing come alive.

The Circle of Life

Click to read more ...

Friday
Apr 06, 2012

Stuff The Internet Says On Scalability For April 6, 2012

It's HighScalability Time:

  • Exascale Supercomputer: how IBM plans to understand data from a universe of light; 905 Billion Objects and 650,000 Requests/Second: S3; 64-cores: PostgreSQL shows linear read scalability
  • Quotable quotes:
    • pkaler: Programming is hard. Scaling is harder.
    • @crucially: As far as I can tell, openstack is what happens when ops people write code.
    • @DEVOPS_BORAT: Goal of sysadmin is replace itself with small shell script. Goal of devops is replace itself with small REST API.
    • @fowlduck: ec2, where dynamic scalability means them running out of instances :(
    • hcarvalhoalves: You know what is amazing? Is that as soon you hit bigger or more general problems, you always face the compromise of "trading X resource for accuracy". Which leads me to believe that software, so far, has only been deterministic by pure accident.
    • Geva Perry: In the Game of Clouds, You Win Or You Die: CloudStack
  • Exclusive: a behind-the-scenes look at Facebook release engineering. Ryan Paul with a fascinating blow by blow of a Facebook software release: Facebook's entire code base is compiled down to a single 1.5GB binary executable...
  • MongoDB Architecture. Ricky Ho with an epic look at the finer details of how MongoDB web scales. Covers: Major difference from RDBMS, Query processing, Storage Model, Data update and Transaction, Replication Model, Sharding Model, Map/Reduce Execution. The conclusion is that MongoDB is very powerful and easy to use.
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...