Entries by HighScalability Team (1576)

Tuesday
Jul 10, 2012

Sponsored Post: New Relic, NetDNA, Torbit, GigaSpaces, AiCache, Logic Monitor, AppDynamics, CloudSigma, ManageEngine, Site24x7

Who's Hiring? 

  • Torbit is hiring. Care about performance? Care about making the internet faster and better? At Torbit we use lots of Golang, Node.js, JavaScript and PHP to solve big challenges.

Fun and Informative Events

  • Your event could be here.

Cool Products and Services

  • New benchmarking report proves GigaSpaces XAP is 57 times faster than VMware GemFire. See the complete comparison here: Cloudify blog, GigaSpaces blog.
  • New Relic - real user monitoring: optimize for humans, not bots. Live application stats, SQL/NoSQL performance, web transactions, proactive notifications. Take 2 minutes to sign up for a free trial.
  • NetDNA, a Tier-1 Global Content Delivery Network, offers a Dual-CDN strategy which allows companies to utilize a redundant infrastructure while leveraging the advantages of multiple CDNs to reduce costs.
  • aiCache creates a better user experience by increasing the speed, scale, and stability of your website. Test aiCache acceleration for free. No sign-up required. http://aicache.com/deploy
  • LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.
  • AppDynamics is the very first free product designed for troubleshooting Java performance while getting full visibility in production environments. Visit http://www.appdynamics.com/free.
  • CloudSigma. Utility style high performance cloud servers in the US and Europe delivered on all 10GigE networking. Run any OS, take advantage of SSD storage and tailored infrastructure options.
  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.
  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

For a longer description of each sponsor, please read more below...

Click to read more ...

Monday
Jul 9, 2012

Data Replication in NoSQL Databases

This is the third guest post (part 1, part 2) of a series by Greg Lindahl, CTO of blekko, the spam free search engine. Previously, Greg was Founder and Distinguished Engineer at PathScale, at which he was the architect of the InfiniPath low-latency InfiniBand HCA, used to build tightly-coupled supercomputing clusters.

blekko's home-grown NoSQL database was designed from the start to support a web-scale search engine, with 1,000s of servers and petabytes of disk. Data replication is a very important part of keeping the database up and serving queries. Like many NoSQL database authors, we decided to keep R=3 copies of each piece of data in the database, and not use RAID to improve reliability. The key goal we were shooting for was a database which degrades gracefully when there are many small failures over time, without needing human intervention.
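To make the R=3 idea concrete, here is a minimal sketch of one way to place three copies of each key on distinct servers and to pick a replacement automatically when a server fails. It uses rendezvous (highest-random-weight) hashing, which is a swapped-in technique for illustration and not necessarily what blekko's database does. The server names and the `replicas_for` helper are hypothetical.

```python
import hashlib

R = 3  # number of copies kept of each piece of data, as in the post


def _score(key: str, server: str) -> int:
    """Deterministic pseudo-random weight for a (key, server) pair."""
    digest = hashlib.sha256(f"{key}|{server}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def replicas_for(key: str, live_servers: list[str], r: int = R) -> list[str]:
    """Return the r live servers that should hold copies of key.

    Rendezvous hashing: rank every live server by its score for this key
    and keep the top r. Removing a server never reorders the others.
    """
    ranked = sorted(live_servers, key=lambda s: _score(key, s), reverse=True)
    return ranked[:r]


servers = [f"server-{i:02d}" for i in range(10)]  # hypothetical cluster
before = replicas_for("url:example.com", servers)

# Simulate one small failure: a server holding a copy dies.
failed = before[0]
after = replicas_for("url:example.com", [s for s in servers if s != failed])
```

In this sketch the two surviving holders stay in the replica set and one new server is chosen to restore the third copy, so a single failure needs no human decision about where the data should go.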

Why don't we like RAID for big NoSQL databases?

Click to read more ...

Friday
Jul 6, 2012

Stuff The Internet Says On Scalability For July 6, 2012

It's HighScalability Time (with 33% more goodness for the same NoPrice):

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Wednesday
Jul 4, 2012

Top Features of a Scalable Database

This is a guest post by Douglas Wilson, EMEA Field Application Engineer at Raima, based on insights from building their Raima Database Manager.

Scalability and Hardware

Scalability is the ability to maintain performance as demands on the system increase, by adding further resources. Normally those resources will be in the form of hardware. Since processor speeds are no longer increasing much, scaling up the hardware normally means adding extra processors or cores, and more memory.

Scalability and Software

However, scalability requires software that can utilize the extra hardware effectively. The software must be designed to allow parallel processing. In the context of a database engine this means that the server component must be multi-threaded, to allow the operating system to schedule parallel tasks on all the cores that are available. Not only that, but the database engine must provide an efficient way to break its workload into as many parallel tasks as there are cores. So, for example, if the database server always uses only four threads then it will make very little difference whether this server runs on a four-core machine or an eight-core machine.
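The four-thread example can be sketched in a few lines. This is an illustration of the sizing logic only, not Raima's implementation: the function names are hypothetical, and in standard CPython builds threads will not speed up CPU-bound work because of the global interpreter lock.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def usable_parallelism(worker_threads: int, cores: int) -> int:
    """Upper bound on the number of tasks that can run at the same instant."""
    return min(worker_threads, cores)


def split_workload(items: list, parts: int) -> list[list]:
    """Break a workload into `parts` near-equal chunks, one per core."""
    return [items[i::parts] for i in range(parts)]


# A server fixed at four threads gains nothing from four extra cores.
assert usable_parallelism(4, 4) == usable_parallelism(4, 8) == 4

# Sizing the pool from the machine lets it use whatever cores are present.
cores = os.cpu_count() or 1
chunks = split_workload(list(range(100)), cores)
with ThreadPoolExecutor(max_workers=cores) as pool:
    partial_sums = list(pool.map(sum, chunks))
total = sum(partial_sums)
```

The point of the design is that both numbers matter: the pool size sets how many tasks may run, and the chunking sets how many tasks exist to be run.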

Distributed Design

Click to read more ...

Monday
Jul 2, 2012

C is for Compute - Google Compute Engine (GCE)

After poking around the Google Compute Engine (GCE) documentation I had some trouble creating a mental model of how GCE works. Is it like AWS, GAE, or Rackspace? Just what is it? After watching Google I/O 2012 - Introducing Google Compute Engine and Google Compute Engine -- Technical Details, it turns out that my initial impression, that GCE is disarmingly straightforward, is the point.

The focus of GCE is on the C, which stands for Compute, and that’s what GCE is all about: deploying lots of servers to solve computationally hard problems. What you get with GCE is a Super Datacenter on Google Steroids.

If you are wondering how you will run the next Instagram on GCE then that would be missing the point. GAE is targeted at applications. GCE is targeted at:

Click to read more ...

Friday
Jun 29, 2012

Stuff The Internet Says On Scalability For June 29, 2012 - The Velocity Edition

Judging from the tweet flow, Velocity looked like a riotous good time. In this video on the main themes at Velocity, after a little microphone enhanced violence, John Allspaw and Steve Souders identify resilience and automation as two of the big ideas behind building a faster and stronger web.

John says resiliency is the idea that we don't live in a perfect world, so trying to build perfect systems is counterproductive. We have to accept failure as a baseline and think in terms of degrees of availability. All abstraction layers leak, so every part of a system must be monitorable and open to introspection.

A focus on resilience means the web is growing up. Resilience has long been a requirement for "real" systems, so it's great to see web builders thinking of their sites as the complex systems they've always been. For the Alpha and Omega on resilience you'll want to watch Dr. Richard Cook's inspiring talk, How Complex Systems Fail.

Here are some of the most enjoyable Quotable Quotes from Velocity:

  • @guypod : LTE latency has roughly the same latency we had with dialup connections. 3G latency is akin to satellite... (@patmeenan at #velocityconf)
  • @akucharski : Akamai produces 1.3 billion log lines every day! #velocityconf
  • @mikeodea : Facebook: 6 billion mobile messages (!!) every 30 minutes #velocityconf
  • @mmaretzke : #velocityconf Last ... mind-boggling ... Facebook facts: 3.8 trillion cache operations in 30 minutes! Unbelievable. Scaling Systems. 160m newsfeeds, 5bln realtime msgs, 10bln profile pics, 108 bln queries on mysql still 30 minutes

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...

Wednesday
Jun 27, 2012

Paper: Logic and Lattices for Distributed Programming

Neil Conway from Berkeley CS is giving an advanced level talk at a meetup today in San Francisco on a new paper: Logic and Lattices for Distributed Programming - extending set logic to support CRDT-style lattices. 

The description of the meetup is probably the clearest introduction to the paper:

Developers are increasingly choosing datastores that sacrifice strong consistency guarantees in exchange for improved performance and availability. Unfortunately, writing reliable distributed programs without the benefit of strong consistency can be very challenging.

 

In this talk, I'll discuss work from our group at UC Berkeley that aims to make it easier to write distributed programs without relying on strong consistency. Bloom is a declarative programming language for distributed computing, while CALM is an analysis technique that identifies programs that are guaranteed to be eventually consistent. I'll then discuss our recent work on extending CALM to support a broader range of programs, drawing upon ideas from CRDTs (Commutative Replicated Data Types).
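For readers new to CRDT-style lattices, a minimal sketch may help. The example below is a grow-only counter written in Python, not Bloom, and is not taken from the paper. Each replica's state is a map from replica name to count, and the merge is a lattice join (per-replica maximum). The replica names and function names are hypothetical.

```python
def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Lattice join of two grow-only counters: the per-replica maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def increment(state: dict[str, int], node: str, amount: int = 1) -> dict[str, int]:
    """Return a new state in which `node` has counted `amount` more events."""
    new_state = dict(state)
    new_state[node] = new_state.get(node, 0) + amount
    return new_state


def value(state: dict[str, int]) -> int:
    """The counter's value is the sum of every replica's contribution."""
    return sum(state.values())


# Three replicas accept increments independently, with no coordination.
a = increment(increment({}, "replica-a"), "replica-a")  # replica-a counted 2
b = increment({}, "replica-b", 3)                        # replica-b counted 3
c = increment({}, "replica-c", 4)                        # replica-c counted 4

# Merges can arrive in any order, any grouping, and any number of times.
assert merge(a, b) == merge(b, a)                        # commutative
assert merge(merge(a, b), c) == merge(a, merge(b, c))    # associative
assert merge(a, merge(a, b)) == merge(a, b)              # idempotent
```

Because the join has those three properties, replicas that have seen the same set of updates end up in the same state regardless of message ordering or duplication, which is the kind of eventual-consistency guarantee the talk description refers to.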

If you have an eye towards understanding the future then this is for you.

Click to read more ...

Tuesday
Jun 26, 2012

Sponsored Post: New Relic, Digital Ocean, NetDNA, Torbit, Reality Check Network, GigaSpaces, AiCache, Logic Monitor, AppDynamics, CloudSigma, ManageEngine, Site24x7

Who's Hiring? 

  • Torbit is hiring. Care about performance? Care about making the internet faster and better? At Torbit we use lots of Golang, Node.js, JavaScript and PHP to solve big challenges.

Fun and Informative Events

Cool Products and Services

  • New Relic - real user monitoring: optimize for humans, not bots. Live application stats, SQL/NoSQL performance, web transactions, proactive notifications. Take 2 minutes to sign up for a free trial.
  • NetDNA, a Tier-1 Global Content Delivery Network, offers a Dual-CDN strategy which allows companies to utilize a redundant infrastructure while leveraging the advantages of multiple CDNs to reduce costs.
  • Digital Ocean is a Simple Cloud Hosting platform that offers Free Unlimited Bandwidth and Virtual Servers from $10 per month. Sign up for free and set-up your virtual server in 60 seconds or less.
  • Reality Check Network offers powerful hosting solutions and managed servers for high traffic/bandwidth websites backed by unlimited network, server and application support.
  • aiCache creates a better user experience by increasing the speed, scale, and stability of your website. Test aiCache acceleration for free. No sign-up required. http://aicache.com/deploy
  • LogicMonitor - Hosted monitoring of your entire technology stack. Dashboards, trending graphs, alerting. Try it free and be up and running in just 15 minutes.
  • AppDynamics is the very first free product designed for troubleshooting Java performance while getting full visibility in production environments. Visit http://www.appdynamics.com/free.
  • CloudSigma. Utility style high performance cloud servers in the US and Europe delivered on all 10GigE networking. Run any OS, take advantage of SSD storage and tailored infrastructure options.
  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.
  • www.site24x7.com : Monitor End User Experience from a global monitoring network.

For a longer description of each sponsor, please read more below...

Click to read more ...

Monday
Jun 25, 2012

StubHub Architecture: The Surprising Complexity Behind the World’s Largest Ticket Marketplace

StubHub is an interesting architecture to take a look at because, as market makers for tickets, they are in a different business than we normally get to consider.

StubHub is surprisingly large, growing at 20% a year, serving 800K complex pages per hour, selling 5 million tickets per year, and handling 2 million API calls per hour. 
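To put those figures on a more familiar scale, here is a quick back-of-the-envelope conversion. It assumes load is spread evenly over time, which is a simplification: StubHub's traffic is bursty, so real peaks will be well above these averages.

```python
SECONDS_PER_HOUR = 3600
DAYS_PER_YEAR = 365

# Figures quoted in the post.
pages_per_hour = 800_000
api_calls_per_hour = 2_000_000
tickets_per_year = 5_000_000

# Averages only; an even spread is an assumption, not a measured profile.
pages_per_second = pages_per_hour / SECONDS_PER_HOUR          # about 222
api_calls_per_second = api_calls_per_hour / SECONDS_PER_HOUR  # about 556
tickets_per_day = tickets_per_year / DAYS_PER_YEAR            # about 13,700
```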

And the ticket space is surprisingly rich in complexity. StubHub's traffic is tricky. It's bursty, centering around unpredictable game outcomes, events, schedules, and seasons. There’s a lot of money involved. There are a lot of different actors involved. There are a lot of complex business processes involved. And StubHub has several complementary but very different parts of their business: they have an ad server component serving ads to sites like ESPN, a rich interactive UI, and a real-time ticket market component.

Most interesting to me is how StubHub is bringing into the digital realm the once quintessentially high-touch physical world of tickets, point-of-sale systems, FedEx delivery, buyers and sellers, and money. They are making it happen with deep electronic integration into organizations (like Major League Baseball) and a Lifecycle Bus that moves complex business processes out of the application space.

It's an interesting problem made more complex by having to move forward while dealing with legacy systems built when getting business-building features out the door was the priority. Let's see how StubHub makes it all work...

Click to read more ...

Friday
Jun 22, 2012

Stuff The Internet Says On Scalability For June 22, 2012

It's HighScalability Time:

  • Quotable Quotes:
    • @xinqiyang: Partition, replicate, index. Many efficiency and scalability problems are solved the same way.
    • @SnideLemon: "Let's switch to a bottle of wine for economic scalability." -- the best justification for additional drinking ever
    • @cloudbees: "A whopping 57% of respondents cited desire for scalability as their chief motivation to go to cloud"
  • You are the next computer. Cells are capable of arithmetic using two naturally occurring molecules: erythromycin, an antibiotic, and phloretin, found in apple trees. "These act as inputs, switching a reaction within the two types of cell on or off. The reaction leads to the production of a red or green fluorescent protein that signals the result of the calculation. For example, in the half adder cell, the presence of both molecules makes it glow red."

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge...

Click to read more ...