Entries by HighScalability Team (1576)

Tuesday
Dec 27, 2011

PlentyOfFish Update - 6 Billion Pageviews and 32 Billion Images a Month

Markus has a short update on their PlentyOfFish Architecture. Impressive November statistics:

  • 6 billion pageviews served
  • 32 billion images served
  • 6 million logins in one day
  • IM servers handle about 30 billion pageviews
  • 11 webservers (5 of which could be dropped)
  • Hired first DBA in July. They currently have a handful of employees.
  • All hosting/CDN costs combined are under $70k/month.

Lesson: a small organization running a simple architecture on raw hardware is still plenty profitable for PlentyOfFish.
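To put the monthly totals in perspective, here is a rough back-of-the-envelope calculation. It assumes a 30-day month and perfectly even traffic, which real sites never have, so treat the results as averages rather than peak load. The variable names are illustrative, and the cost figure uses the $70k upper bound quoted above.

```python
# Back-of-the-envelope averages from the November figures quoted above.
# Assumptions (not from the source): a 30-day month and evenly spread traffic.
SECONDS_PER_MONTH = 30 * 24 * 60 * 60

pageviews_per_month = 6_000_000_000
images_per_month = 32_000_000_000
webservers = 11
monthly_cost_usd = 70_000  # upper bound: "under $70k/month"

pageviews_per_sec = pageviews_per_month / SECONDS_PER_MONTH
images_per_sec = images_per_month / SECONDS_PER_MONTH
pageviews_per_sec_per_server = pageviews_per_sec / webservers
# Hosting/CDN spend divided across pageviews only; images are ignored here.
cost_per_million_pageviews = monthly_cost_usd / (pageviews_per_month / 1_000_000)

print(f"{pageviews_per_sec:,.0f} pageviews/s on average")
print(f"{images_per_sec:,.0f} images/s on average")
print(f"{pageviews_per_sec_per_server:,.0f} pageviews/s per webserver")
print(f"at most ${cost_per_million_pageviews:.2f} per million pageviews")
```

Even as averages, these numbers show why a handful of webservers and a modest hosting bill are notable at this scale.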


Friday
Dec 23, 2011

Stuff The Internet Says On Scalability For December 23, 2011

A merry HighScalability to all and to all a good night:

  • Santa: 3.7 million appointments; iPad2 == 1986 Cray 2 6 processor super computer; Watson: 200 million pages of natural language content 
  • Funny: a cautionary tale about storage and backup. Where is my data? I’m kinda big deal after all! I should have listened to my postdoc, he can build cheaper storage than you can.
  • Nothing stirs up more energy than when someone says they are abandoning an old beloved framework for a newer, sexier model. Feelings of betrayal and abandonment leak over everything, which is always a good draw for reality social networking. Here Paul Querna explains why in The Switch: Python to Node.js. And here we see the response on Hacker News. Python got the job done for Cloudkick, they were acquired, but they wanted something more going forward, a trophy wife if you will, after the first wife put them through law school. Good discussion all around. You may find something that helps in your own platform decision. Or you may just find it entertaining.
For much more on what the Internet has to Say on Scalability, please read below...


Friday
Dec 23, 2011

Funny: A Cautionary Tale About Storage and Backup

Storage: everyone wants it, but nobody wants to pay for it. This is one of those funny cartoons showing the doom of that sort of thinking...pay me now or cry over lost data later.

Loved: Where is my data? I’m kinda big deal after all! I should have listened to my postdoc, he can build cheaper storage than you can. I should have went to Newegg.

Found via Joe Landman and James Cuff. Joe talks about how RAID isn't backup and has some more wit and wisdom.

Monday
Dec 19, 2011

How Twitter Stores 250 Million Tweets a Day Using MySQL

Jeremy Cole, a DBA Team Lead/Database Architect at Twitter, gave a really good talk at the O'Reilly MySQL conference: Big and Small Data at @Twitter, where the topic was thinking of Twitter from the data perspective.

One of the interesting stories he told was of the transition from Twitter's old way of storing tweets using temporal sharding, to a more distributed approach using a new tweet store called T-bird, which is built on top of Gizzard, which is built using MySQL.
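The difference between the two routing strategies can be sketched roughly as follows. This is a hypothetical illustration only: the function names, the one-shard-per-month granularity, and the MD5-based hashing are assumptions made for the example, not details of Twitter's original store, T-bird, or Gizzard.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical illustration; not Twitter's, T-bird's, or Gizzard's actual code.

def temporal_shard(created_at: datetime) -> str:
    """Temporal sharding: route a tweet by when it was written.
    Assumes one shard per calendar month for the sake of the example."""
    return f"tweets_{created_at:%Y_%m}"

def hashed_shard(tweet_id: int, num_shards: int) -> int:
    """A more distributed alternative: route by a stable hash of the tweet id."""
    digest = hashlib.md5(str(tweet_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# A burst of tweets written in the same month.
now = datetime(2011, 12, 19, tzinfo=timezone.utc)
tweet_ids = range(1_000)

temporal_targets = {temporal_shard(now) for _ in tweet_ids}
hashed_targets = {hashed_shard(tweet_id, 8) for tweet_id in tweet_ids}

print(len(temporal_targets), "shard(s) take all new writes under temporal sharding")
print(len(hashed_targets), "shard(s) share the writes under hashed sharding")
```

With time-based routing every new write lands on the newest shard, while hashing on the id spreads the same writes across many shards.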

Twitter's original tweet store:

Click to read more ...

Friday
Dec 16, 2011

Stuff The Internet Says On Scalability For December 16, 2011

A HighScalability is forever:

  • eBay: tens of millions of lines of code; Google code base change rate per month: 50%; Apple: 100 million downloads; Internet: 186 Gbps
  • Quotable quotes:
    • @OttmarAmann: Scalability is not as important as managing complexity
    • @amankapur91: Does scalability imply standardization, and then does standardization imply loss of innovation?
  • Why wireless mesh networks won’t save us from censorship. Shaddi Hasan harshes the buzz on the utopian vision of a darknet freeing us from a SOPA/RIAA/everything tyranny. The reasons: Management is hard and expensive; Omni-directional antennas suck; Single-radio equipment doesn’t work; multi-radio equipment is very expensive; Your RF tricks won’t help you here; Unplanned mesh networks break routing. My take: what can't be routed around must be crushed.
You know you want to read more Stuff the Internet has to say on Scalability, so read more below...


Tuesday
Dec 13, 2011

Sponsored Post: Cedexis, Callfire, Attribution Modeling, Logic Monitor, New Relic, ScaleOut, AppDynamics, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

  • Callfire, one of the largest cloud telephony platforms on the web, is hiring a Sr. Software Engineer. You can learn more here.

Fun and Informative Events

  • Sign up for this free 30-minute webinar, which explores how new technology can determine which ads users have actually seen and discusses the C3 Metrics Labs analysis of over 2 billion impressions.

Cool Products and Services

  • Not satisfied with performance in the cloud? Visit Cedexis and Lose the Wait. Looking for a path to the Hybrid Cloud? Cedexis can help you find the right path.
  • LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.
  • New Relic - real user monitoring: optimize for humans, not bots. Live application stats, SQL/NoSQL performance, web transactions, proactive notifications. Take 2 minutes to sign up for a free trial.
  • ScaleOut StateServer® Delivers Map/Reduce Analysis and Scalable Application Performance. Gain competitive advantage with rapid access to business intelligence. Download a free evaluation trial today.
  • AppDynamics is the very first free product designed for troubleshooting Java performance while getting full visibility in production environments. Visit http://www.appdynamics.com/free.
  • CloudSigma. Instantly scalable European cloud servers.
  • ManageEngine Applications Manager: Monitor physical, virtual and Cloud Applications.
  • www.site24x7.com: Monitor End User Experience from a global monitoring network.

For a longer description of each sponsor, please read more below...


Monday
Dec 12, 2011

Netflix: Developing, Deploying, and Supporting Software According to the Way of the Cloud

At a Cloud Computing Meetup, Siddharth "Sid" Anand of Netflix, backed by a merry band of Netflixians, gave an interesting talk: Keeping Movies Running Amid Thunderstorms. While the talk gave a good overview of their move to the cloud, issues with capacity planning, thundering herds, latency problems, and simian armageddon, I found myself most taken with how they handle software deployment in the cloud.

I've worked on half a dozen or more build and deployment systems, some small, some quite large, but never for a large organization like Netflix in the cloud. The cloud has this amazing capability that has never existed before that enables a novel approach to fault-tolerant software deployments: the ability to spin up huge numbers of instances to completely run a new release while running the old release at the same time.

The process goes something like: 

Click to read more ...

Friday
Dec 9, 2011

Stuff The Internet Says On Scalability For December 9, 2011

It takes a licking and keeps on HighScalabilitying:

For even more Stuff the Internet Says on Scalability, click below...


Thursday
Dec 8, 2011

Update on Scalable Causal Consistency For Wide-Area Storage With COPS

Here are a few updates on the article Paper: Don’t Settle For Eventual: Scalable Causal Consistency For Wide-Area Storage With COPS from Mike Freedman and Wyatt Lloyd.

Q: How could software architectures change in response to causal+ consistency?

A: I don't really think they would much. Somebody would still run a two-tier architecture in their datacenter: a front tier of webservers running both (say) PHP and our client library, and a back tier of storage nodes running COPS. (I'm not sure if it was obvious given the discussion of our "thick" client -- you should think of the COPS client dropping in where a memcache client library does...albeit ours has per-session state.)
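To make "a memcache-style client with per-session state" more concrete, here is a minimal sketch. It is not the COPS API: `Store`, `SessionClient`, `put_after`, and the single-counter versioning are hypothetical names and simplifications chosen for the example. The only idea it tries to show is that the client library remembers which versions a session has read or written and attaches them to later writes as dependencies.

```python
# Hypothetical sketch; class and method names are illustrative, not the COPS API.

class Store:
    """Stand-in for the back tier of storage nodes: key -> (version, value, deps)."""

    def __init__(self):
        self.data = {}
        self.clock = 0  # simplification: one global counter issues versions

    def put_after(self, key, value, deps):
        """Store a value together with the versions it causally depends on."""
        self.clock += 1
        self.data[key] = (self.clock, value, dict(deps))
        return self.clock

    def get(self, key):
        version, value, _deps = self.data[key]
        return version, value


class SessionClient:
    """Drop-in get/put client that also keeps per-session state."""

    def __init__(self, store):
        self.store = store
        self.context = {}  # key -> version this session has read or written

    def get(self, key):
        version, value = self.store.get(key)
        self.context[key] = version  # remember what this session has seen
        return value

    def put(self, key, value):
        # Everything seen so far in the session becomes a dependency of the write.
        version = self.store.put_after(key, value, self.context)
        self.context[key] = version
        return version


store = Store()
alice = SessionClient(store)
alice.put("photo", "beach.jpg")
alice.put("comment", "nice photo")
print(store.data["comment"])  # the comment records its dependency on the photo
```

From the application's point of view the calls still look like plain get and put; the dependency tracking stays inside the client library, and the storage tier can use the recorded dependencies to avoid exposing the comment before the photo it refers to.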

Q: Why not just use vector clocks?

Click to read more ...

Tuesday
Dec 6, 2011

Instagram Architecture: 14 Million users, Terabytes of Photos, 100s of Instances, Dozens of Technologies

Instagram is a free photo sharing and social networking service for your iPhone that has been an instant success. Growing to 14 million users in just over a year, they reached 150 million photos in August while amassing several terabytes of photos, and they did this with just 3 Instaneers, all on the Amazon stack.

The Instagram team has written up what can be considered the canonical description of an early stage startup in this era: What Powers Instagram: Hundreds of Instances, Dozens of Technologies.

Instagram uses a pastiche of different technologies and strategies. The team is small yet has experienced rapid growth riding the crest of a rising social and mobile wave, it uses a hybrid of SQL and NoSQL, it uses a ton of open source projects, they chose the cloud over colo, Amazon services are highly leveraged rather than building their own, reliability is through availability zones, async work scheduling links components together, the system is composed as much as possible of services exposing an API and external services they don't have to build, data is stored in-memory and in the cloud, most code is in a dynamic language, custom bits have been coded to link everything together, and they have gone fast and kept small. A very modern construction.

We'll just tl;dr the article here; it's very well written and to the point. Definitely worth reading. Here are the essentials:

Click to read more ...