High Scalability -


Stuff The Internet Says On Scalability For July 27, 2012

Friday, July 27, 2012 at 9:15AM

It's HighScalability Time:

Almost 1 Billion Users: Facebook; 30,000 connections across 94 locations: the Olympics Network; 2.5 quintillion: bytes of data created each day; 80K QPS: MemSQL.
In some early results Zencoder found EC2 was faster than GCE in their video transcoding tests, saying Google needs larger instances with faster CPUs. Love how Google's jbeda said they would take a look at the results. +1 for competition and benchmarks. Something to keep in mind: for Google, a core means a hyperthread per virtual CPU. So an n1-standard-8 instance gets 4 physical cores and 8 hyperthreads, not 8 physical cores.
Kevin Rose recommends hiring generalists rather than developers with niche skills, warns against giving away your company, and thinks advisors should be investors. Founders should also probably stick around and managers shouldn't blame developers.
Is MemSQL the world's fastest database? BS meter on high, but it was created by two former Facebook engineers, who should know their MySQL. And recall Cassandra started out of Facebook. The magic sauce of this MySQL-compatible database is that it operates in-memory and optimizes queries by translating SQL into C++.
That pesky power law ruins everything: 20% of iOS applications make 97% of the revenue. Directly related to The Sparrow Problem, where low software prices make it hard to build standalone software companies, which makes spirit-killing buyouts the goal.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...


Vertical Scaling Ascendant - How are SSDs Changing Architectures?

Wednesday, July 25, 2012 at 9:17AM

With Amazon announcing new High I/O 2TB SSD instances, the age of SSD has almost arrived. I say almost because the $27K a year price tag for the hi1.4xlarge on-demand instance type is outside the budget of many. Yet even at the full on-demand rate the price per IOP for the High I/O instance is attractive: 27 cents per IOP ($27K/100K IOPS) vs. $1.25 for disk. With the obvious benefits of giant SSD machines combined with 10 Gbps networking, it’s interesting to consider: what architecture decisions might you make differently in the future?
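The price-per-IOP comparison is simple division, and it is worth redoing with your own numbers. A minimal sketch in Python, using only the figures quoted above; the function name is made up for illustration, and the $1.25 disk figure is taken as given from the text rather than derived:

```python
# Back-of-the-envelope price per IOP, using the figures quoted in the post.
def price_per_iop(annual_cost_usd: float, iops: float) -> float:
    """Annual cost in dollars divided by sustained IOPS."""
    return annual_cost_usd / iops

# hi1.4xlarge at the on-demand rate: about $27K a year for about 100K IOPS.
ssd = price_per_iop(27_000, 100_000)

# Per-IOP figure for disk as quoted in the post (an input here, not derived).
disk = 1.25

print(f"SSD: ${ssd:.2f} per IOP, disk: ${disk:.2f} per IOP, ratio: {disk / ssd:.1f}x")
```

Plugging in a reserved-instance price instead of the on-demand rate is a one-line change and makes the gap wider.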

More Headroom for Vertical Scaling Simplifies Everything

The beauty of higher hardware performance is that it shifts effort away from the programmer, which allows developers to focus on the business of business, minimizing trickeration. This has always been the allure of vertical scaling and is well realized by SSDs through a combination of high throughput, low latencies, and, just as important, high densities.

We have a few early examples showing the performance punch of the new High IO instance:


Who's Hiring?

Wednesday, July 25, 2012

Are you a seasoned systems admin? ground(ctrl) is looking for you! http://groundctrl.com/#!/jobs/senior-systems-administrator
Torbit is hiring! Care about performance? Care about making the internet faster and better? At Torbit we use lots of Golang, Node.js, JavaScript and PHP to solve big challenges.

Cool Products and Services

ElasticHosts launches white-label cloud reseller program offering 30% revenue share on fully rebranded cloud hosting.
Atlantic.Net with industry leading cloud servers backed by ultra-fast 40 Gigabits 4x Quad Rate Infiniband speeds, high throughput, low latency and newest RDMA technology. Free Trial Offer!
ScaleOut Software. In-memory Data Grids for the Enterprise. Download a Free Trial.
Take your application to the next level of performance & scalability with the GigaSpaces In-Memory Data Grid (IMDG).
New Relic - real user monitoring: optimize for humans, not bots. Live application stats, SQL/NoSQL performance, web transactions, proactive notifications. Take 2 minutes to sign up for a free trial.
NetDNA, a Tier-1 Global Content Delivery Network, offers a Dual-CDN strategy which allows companies to utilize a redundant infrastructure while leveraging the advantages of multiple CDNs to reduce costs.
aiCache creates a better user experience by increasing the speed, scale, and stability of your website. Test aiCache acceleration for free. No sign-up required. http://aicache.com/deploy
LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.
AppDynamics is the very first free product designed for troubleshooting Java performance while getting full visibility in production environments. Visit http://www.appdynamics.com/free.
CloudSigma. Utility style high performance cloud servers in the US and Europe delivered on all 10GigE networking. Run any OS, take advantage of SSD storage and tailored infrastructure options.
ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.
www.site24x7.com : Monitor End User Experience from a global monitoring network.

Fun and Informative Events

Your very special event could be here.

For a longer description of each sponsor, please read more below...

Ask HighScalability: How Do I Build My MegaUpload + Itunes + YouTube Startup?

Monday, July 23, 2012 at 10:00AM

This question was sent in by Val, who is asking for a little help in creating the next big thing. Any ideas?

I'm planning to run my own, first startup website and have been surfing the webs for relevant info to plan the technology I will use for it (the frontend and the backend, including the software and the hardware). The website will be something like a combination of:

MegaUpload (users will upload their files)
iTunes (users will be paid for their uploads)
and YouTube (in the future I'm planning to let users watch/listen to the content online, without downloading).

I don't have any investors yet, nor the budget - I'm still preparing the idea and I'm going to create a first implementation (an "alpha version") before I show it to potential investors. Hence the initial technologies have to be extremely cheap *but* also highly scalable in the future so that I don't have to redo anything when the website grows.

Unfortunately I don't have much experience in running big websites but, on the other hand, I hope my website will grow extremely big (of course).

My questions are:


State of the CDN: More Traffic, Stable Prices, More Products, Profits - Not So Much

Monday, July 23, 2012 at 9:16AM

CDNs (content delivery networks) are the secret shadow super powers behind the web and Dan Rayburn at streamingmedia.com is the go-to investigative reporter for quality information on CDNs. Every year Dan has a Content Delivery Summit on all things CDN and those videos are now available. Dan also gives a kind of state of the industry talk where he does something wonderful: he gives real numbers and prices. Dan really knows his stuff and is an excellent speaker, so watch the video, but here’s my gloss on the state of the CDN so far this year:

Massive growth. Large customers are expecting 126% growth in video traffic over last year; medium-size customers are seeing 48% traffic growth; small customers are seeing 73.3% traffic growth.
More traffic != More profit...


Stuff The Internet Says On Scalability For July 20, 2012

Friday, July 20, 2012 at 9:15AM

It's HighScalability Time:

4 Trillion Objects: Windows Azure Storage
Quotable Quotes:
- @benjchristensen: “What if we could make the data dense and cheap instead of sparse and expensive?” James Gosling @liquidrinc
- @sinetpd360: People trying new things and sharing is what helps create scalability. Jim Rickabaugh #siis2012
- @rbranson: This h1.4xlarge running 160GB PostgreSQL database pushing ~17,200 index scan rows/sec. r_await is 0.79ms, box is 92% idle.
- @sturadnidge: faster net and disk greatly reduces repair time and impact so we can load up the instances with far more dat
With Amazon announcing 2TB SSD instances, the age of SSD has almost arrived. Netflix has already published a very thorough post on the wonderfulness of SSD for both performance and taming the long latency tail. They see 100K IOPS or 1GByte/sec on an untuned system. Netflix projects: the hi1.4xlarge configuration is about half the system cost for the same throughput; the mean read request latency was reduced from 10ms to 2.2ms; the 99th percentile request latency was reduced from 65ms to 10ms. Vertical scaling gets a huge boost as the bottleneck will likely move from IO to CPU again. Software will need a rewrite to be SSD optimized. Think about removing caching layers. Think reserved instances to bring down the cost. Think putting hot data into SSDs. We'll also see pressure to fix TCP interrupt bottlenecks and IRQ affinity problems.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Disks Ain't Dead Yet: GraphChi - a disk-based large-scale graph computation

Wednesday, July 18, 2012 at 9:38AM

GraphChi uses a Parallel Sliding Windows method which can: process a graph with mutable edge values efficiently from disk, with only a small number of non-sequential disk accesses, while supporting the asynchronous model of computation.

The result is that graphs with billions of edges can be processed on just a single machine. It uses a vertex-centric computation model similar to Pregel, which supports iterative algorithms as opposed to the batch style of MapReduce. Streaming graph updates are supported.
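To make "vertex-centric" concrete, here is a minimal sketch of PageRank written as a per-vertex update function in Python. This is not GraphChi's API and it does not implement Parallel Sliding Windows: the whole graph sits in memory, updates are synchronous rather than asynchronous, and every name in it is made up for illustration. It only shows the shape of the model, where each vertex recomputes its value from its neighbors and the loop repeats.

```python
from collections import defaultdict

def pagerank(edges, iterations=20, damping=0.85):
    """Vertex-centric PageRank over a list of (src, dst) edges.

    Assumes every vertex has at least one outgoing edge.
    """
    out_edges = defaultdict(list)
    in_edges = defaultdict(list)
    vertices = set()
    for src, dst in edges:
        out_edges[src].append(dst)
        in_edges[dst].append(src)
        vertices.update((src, dst))

    n = len(vertices)
    rank = {v: 1.0 / n for v in vertices}

    for _ in range(iterations):
        new_rank = {}
        for v in vertices:
            # The "update function": it sees only v and its in-neighbors.
            incoming = sum(rank[u] / len(out_edges[u]) for u in in_edges[v])
            new_rank[v] = (1 - damping) / n + damping * incoming
        rank = new_rank  # synchronous step: all vertices advance together
    return rank

# Tiny illustrative graph: a -> b, a -> c, b -> c, c -> a
ranks = pagerank([("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")])
for vertex in sorted(ranks):
    print(vertex, round(ranks[vertex], 3))
```

The contrast with MapReduce is the loop: the same small update runs over and over against evolving vertex state instead of one big batch pass.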

About GraphChi, Carlos Guestrin, codirector of Carnegie Mellon's Select Lab, says:

A Mac Mini running GraphChi can analyze Twitter's social graph from 2010, which contains 40 million users and 1.2 billion connections, in 59 minutes. "The previous published result on this problem took 400 minutes using a cluster of about 1,000 computers."

Strategy: Kill Off Multi-tenant Instances with High CPU Stolen Time

Wednesday, July 18, 2012 at 9:23AM

Are all instances created equal? Perhaps not. Under multi-tenancy multiple virtual machines run on the same physical host, so not all applications will run equally well on every instance. In that case it makes sense to measure and move to a better performing instance.

That's the interesting idea from @botchagalupe:

Imagine something like a "performance monkey" where an infrastructure is so bound that it can kill lower performing instances automatically.

@adrianco says Netflix has thought of doing the same:

We've looked at killing off multi-tenant instances that have high CPU stolen time...
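On Linux, stolen time shows up as the eighth counter on the aggregate `cpu` line of `/proc/stat`, so measuring it takes two samples and a subtraction. A minimal sketch in Python, assuming a Linux guest; the helper names and the 10% threshold are made up for illustration and are not Netflix's tooling or policy:

```python
import time

def parse_cpu_line(line):
    """Return (steal, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Field order: user nice system idle iowait irq softirq steal [guest guest_nice]
    # Guest time is already counted inside user/nice, so only the first 8 are summed.
    values = [int(x) for x in fields[1:9]]
    return values[7], sum(values)

def steal_percent(before, after):
    """Percentage of CPU time stolen between two samples of the 'cpu' line."""
    steal0, total0 = parse_cpu_line(before)
    steal1, total1 = parse_cpu_line(after)
    elapsed = total1 - total0
    return 100.0 * (steal1 - steal0) / elapsed if elapsed > 0 else 0.0

def should_replace(percent, threshold=10.0):
    """Hypothetical policy: flag the instance when steal exceeds the threshold."""
    return percent > threshold

def measure(interval_s=5.0, path="/proc/stat"):
    """Sample the 'cpu' line twice, interval_s seconds apart (Linux only)."""
    with open(path) as f:
        first = f.readline()
    time.sleep(interval_s)
    with open(path) as f:
        second = f.readline()
    return steal_percent(first, second)
```

Calling `should_replace(measure())` from a cron job or health check would give a crude version of the "performance monkey" idea. What to do with a flagged instance (terminate, relaunch, alert) is left out on purpose.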


Stuff The Internet Says On Scalability For July 13, 2012

Friday, July 13, 2012 at 9:15AM

It's HighScalability Time (Good luck today):

A Friday the 13th Postmorterama:
- James Hamilton with some high powered perspective on the report for the Fukushima Nuclear Accident. Apparently they haven't heard of the blameless post-mortem. Lots of interesting stuff, but this is a potentially disaster saving general lesson learned: operators can’t figure out what is happening or take appropriate action without detailed visibility into the state of the system.
- Evernote with a nicely detailed note on a recent outage. A kernel panic happened while upgrading two new “shard” servers with 3x as much RAM, SSDs instead of 15krpm disks, bonded networking, and an updated kernel. They had to revert and shite loves to happen when other shite happens.
- Heroku with their postmortem on what happened when AWS went down. They lost 30% of their instances across 3 AZs in the US-East region. Rich detail on the impact of the AWS outage, but not much on what they can do about it in the future, probably because there's not much to do unless you want to take the multi-region hit.
- Forget the money, follow the lack of power. Salesforce, like Amazon, suffered an outage because of a power failure. Why don't these expensive backup power systems seem to work?
Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...
