Entries in Strategy (358)

Tuesday
Feb 03, 2009

10 More Rules for Even Faster Websites

Update: How-To Minimize Load Time for Fast User Experiences shows how to analyze the bottlenecks that keep websites and blogs from loading quickly and how to resolve them. 80-90% of end-user response time is spent on the frontend, so it makes sense to concentrate your efforts there before heroically rewriting the backend. Take a shower before buying a Porsche, if you know what I mean. Steve Souders, author of High Performance Web Sites and YSlow, has ten more best practices to speed up your website:

  • Split the initial payload
  • Load scripts without blocking
  • Don’t scatter scripts
  • Split dominant content domains
  • Make static content cookie-free
  • Reduce cookie weight
  • Minify CSS
  • Optimize images
  • Use iframes sparingly
  • To www or not to www

    Sadly, according to String Theory, there are only 26.7 rules left, so get them while they're still in our dimension. Here are slides on the first few rules. Love the speeding dog slide. That's exactly what my dog looks like traveling down the road, head hanging out the window, joyfully battling the wind. Also see 20 New Rules for Faster Web Pages.

    Saturday
    Jan 17, 2009

    Intro to Caching, Caching Algorithms and Caching Frameworks Part 1

    An informative and well organized post on caching. It covers: Why do we need cache?, What is a Cache?, Cache Hit, Cache Miss, Storage Cost, Retrieval Cost, Invalidation, Replacement Policy, Optimal Replacement Policy, Caching Algorithms, Least Frequently Used (LFU), Least Recently Used (LRU), Least Recently Used 2 (LRU2), Two Queues, Adaptive Replacement Cache (ARC), Most Recently Used (MRU), First in First out (FIFO), Distributed Caching, and Measuring Cache.
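
    To make the simplest of these policies concrete, here's a minimal LRU cache sketch in Python (my own illustration, not from the post; the capacity is arbitrary). An OrderedDict tracks recency, and the least recently used entry is evicted when the cache is full:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently used entry when full."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None                    # cache miss
        self._items.move_to_end(key)       # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used
```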

    Friday
    Jan 02, 2009

    Strategy: Understanding Your Data Leads to the Best Scalability Solutions

    In the article Building Super-Scalable Web Systems with REST, Udi Dahan tells an interesting story of how a weather reporting system was made to scale to over 10 million users. That many users hitting their weather database didn't scale. Caching in a straightforward way wouldn't work because weather is obviously local. Caching all local reports would bring the entire database into memory, which would work for some companies, but wasn't cost efficient for them. So in typical REST fashion they turned locations into URIs. For example: http://weather.myclient.com/UK/London. This allows the weather information to be cached by intermediaries instead of hitting their servers. Hopefully for each location their servers will be hit only a few times, and then the caches will be hit until expiry. To send users directly to the correct location, an IP location check is performed on login and stored in a cookie. The lookup is done once, and from then on a GET is performed directly on the resource. There's no need to hit their servers and do a lookup on the user to get the location; that's all bypassed. I like Udi's summary of the approach, and it's why I think this is a good strategy: This isn’t a “cheap trick”. While being straightforward for something like weather, understanding the nature of your data and intelligently mapping that to a URI space is critical to building a scalable system, and reaping the benefits of REST.
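
    To make the mechanics concrete, here's a minimal WSGI sketch (my own, not Udi's code; the URL scheme and 30-minute expiry are illustrative) showing how a location path becomes a cacheable resource that intermediaries can serve without hitting the origin:

```python
from wsgiref.simple_server import make_server

def weather_app(environ, start_response):
    # e.g. GET /UK/London -> look up the report for that location once;
    # intermediaries cache the response until it expires.
    location = environ["PATH_INFO"].strip("/")       # "UK/London"
    report = fetch_weather_report(location)          # the database hit
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Cache-Control", "public, max-age=1800"),   # cacheable for 30 minutes
    ])
    return [report.encode("utf-8")]

def fetch_weather_report(location):
    # Stand-in for the real database lookup.
    return f"Weather for {location}: 12C, light rain"

if __name__ == "__main__":
    make_server("", 8000, weather_app).serve_forever()
```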

    Saturday
    Dec 13, 2008

    Strategy: Facebook Tweaks to Handle 6 Times as Many Memcached Requests

    Our latest strategy is taken from a great post by Paul Saab of Facebook, detailing how, with the changes they've made to memcached, Facebook has:

    ...been able to scale memcached to handle 200,000 UDP requests per second with an average latency of 173 microseconds. The total throughput achieved is 300,000 UDP requests/s, but the latency at that request rate is too high to be useful in our system. This is an amazing increase from 50,000 UDP requests/s using the stock version of Linux and memcached.

    To scale, Facebook keeps hundreds of thousands of TCP connections open to their memcached processes. First, this is still amazing. It's not so long ago that you could never have done this. Optimizing connection use was always a priority because the OS simply couldn't handle large numbers of connections, threads, or CPUs. Getting to this point is a big accomplishment. Still, at that scale new problems surface that must be solved.

    Some of the problems Facebook faced and fixed:

  • Per-connection consumption of resources. What works well at a low number of inputs can totally kill a system as inputs grow. Memcached uses a per-connection buffer, which adds up to a lot of memory that could otherwise be used to store data. There's nothing wrong with this design choice, but Facebook switched to a per-thread shared connection buffer and reclaimed gigabytes of RAM on each server.
  • Kernel lock contention. Facebook discovered under load there was lock contention when transmitting through a single UDP socket from multiple threads. Sockets are data structures too and they are subject to the usual lock contention issues. Facebook got around this issue by maintaining separate reply sockets in different threads so they would not contend with the receive sockets. They found another bottleneck in Linux’s “netdevice” layer that sits in-between IP and device drivers. They changed the dequeue algorithm to batch dequeues so more work was done when they had the CPU.
  • Application lock contention. Nothing brings out lock issues like moving to more cores. Facebook found that when they moved to 8-core machines, a global lock protecting stats collection consumed 20-30% of CPU. In applications that require little processing per request, as memcached does, this is not unexpected, but spending your CPU on real work is a better idea. So they collected stats on a per-thread basis and then calculated a global view on demand (see the sketch after this list).
  • Interrupt floods and starvation. With so much traffic directed at a single server, the hardware can flood the CPU(s) with interrupts and keep the CPU from doing "real" work. To get around this problem Facebook implemented some complicated strategies to load balance IO across all the cores. As I am less clever, I might try more network cards with a TCP offload engine.
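
    Here's a minimal sketch of the per-thread stats idea from above (mine, not Facebook's actual code): each thread increments its own counters without taking a shared lock on the hot path, and the global view is aggregated only on demand:

```python
import threading
from collections import Counter

class PerThreadStats:
    """Each thread bumps its own Counter; no shared lock on the hot path."""

    def __init__(self):
        self._local = threading.local()
        self._all = []                               # one Counter per thread
        self._register_lock = threading.Lock()       # taken once per thread, not per request

    def incr(self, stat, n=1):
        counter = getattr(self._local, "counter", None)
        if counter is None:                          # first use in this thread
            counter = self._local.counter = Counter()
            with self._register_lock:
                self._all.append(counter)
        counter[stat] += n                           # lock-free fast path

    def global_view(self):
        # Aggregate on demand; this is the rare, slow path. Reads may be
        # slightly stale while other threads update, which is fine for stats.
        total = Counter()
        for counter in self._all:
            total.update(counter)
        return total
```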

    When you read Paul's article, keep in mind the incredible number of man-hours that went into profiling the system: not just their application, but the entire software and hardware stack. Then add in the research, planning, and trying of different solutions to see if anything changed for the better. It's a lot of work. Notice that using a nifty new parallel language or moving to a cloud wouldn't have made a bit of difference. It's complete mastery of their system that made the difference.

    A summary of potential strategies:
  • Profile everything. Problems are always specific. The understanding of the problem must be specific. The fix must be specific.
  • Burn profiling into your regression tests. Detect when and where performance tanks as a regular part of your build.
  • Use resources in proportion to what grows slowest. This requires multiplexing, but at least your resource usage is more predictable and bounded.
  • Batch work. When you have the CPU, do all the work you possibly can in your quantum, or the whole system grinds to a halt in processing overhead.
  • Do work and maintain resources per task. Otherwise locking for shared resources takes more and more time when there's less and less time to do the work that needs to be done.
  • Change algorithms. Sometimes you simply need to do things differently. Tweaking will only get you so far.

    You can find their changes on github, the hub that says "git."
    Thursday
    Nov 13, 2008

    Plenty of Fish Says Scaling for Free Doesn't Pay

    Plenty of Fish CEO Markus Frind, famous nerd hero for making over $10 million a year from Google ads on a free dating site he made and ran all by himself, now sees a problem with the free model:

    The problem with free is that every time you double the size of your database the cost of maintaining the site grows 6 fold. I really underestimated how much resources it would take, I have one database table now that exceeds 3 billion records. The bigger you get as a free site the less money you make per visit and the more it costs to service a visit...There is really no money in being free and we have to start experimenting with other models now or we won’t be able to compete in 3 or 4 years.
    As one commenter succinctly put it: the “golden time” of AdSense is over. Time to look at costs. The POF architecture is to run scarily huge tables on single machines. They also buy and maintain their own SAN. So it seems scaling up is what is increasing costs and decreasing profits. I wonder if the economics of cloud storage and cloud architectures might have a more linear cost curve?
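
    To put a number on that claim: if cost multiplies by 6 every time the database doubles in size, then cost grows polynomially with size at roughly the 2.6th power, a brutal exponent when revenue per visit is flat or falling:

```latex
% Claimed growth: doubling the data multiplies the cost by 6.
C(2n) = 6\,C(n) \;\Longrightarrow\; C(n) \propto n^{\log_2 6} \approx n^{2.585}
```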

    Monday
    Nov 03, 2008

    How Sites are Scaling Up for the Election Night Crush

    Election night is a big traffic boost for news and social sites. Yahoo expects up to 400 million page views on Election Day. Data Center Knowledge has an excellent article on how various sites are preparing to handle spikes in election night traffic. Some interesting bits:

  • Prepare ahead. Don't wait to handle spikes, plan and prepare before the blessed event.
  • Use a CDN. Daily Kos puts images on a CDN, but the dynamic nature of their site means they can't use a CDN for their other content.
  • Scale up. Daily Kos: “to handle the traffic better, we moved to a cluster of six quad core Xeons with 8GB RAM for webheads that all boot off a central NFS (Network File System) root, with the capability of adding more webheads as needed.” They also “added two 16GB eight-core Xeons and a 6×73GB RAID-10 array for database files running a MySQL master/slave setup.”
  • Add Cache. Daily Kos added a 1GB memcached instance to each webhead.
  • Change Caching Strategy. Daily Kos puts fully rendered pages into memcached.
  • Change Serving Strategy. Daily Kos serves cached pages from memcached directly to anonymous users via lighttpd running as the front-end proxy. This moves a lot of work off the backend and distributes it across the new hefty webheads. Site performance has improved greatly (see the sketch after this list).
  • Add Capacity. Limelight expanded its network capacity to over 2 Terabits per second.

    Tonight is a big night for a lot of sites. It's interesting to see how some are responding to the challenge. A lot of what they are doing will work for you too.
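
    As a rough illustration of that serving strategy (my sketch, not Daily Kos's code; the key scheme and TTL are made up), the core logic looks like this: anonymous requests are answered straight from memcached, while logged-in users and cache misses fall through to full rendering:

```python
import pylibmc  # assumes a memcached client; python-memcached works similarly

cache = pylibmc.Client(["127.0.0.1:11211"])
PAGE_TTL = 60  # seconds; fully rendered pages go stale quickly

def handle_request(path, user=None):
    # Logged-in users get personalized pages; only anonymous traffic is cacheable.
    if user is None:
        page = cache.get("page:" + path)
        if page is not None:
            return page                          # served without touching the backend
    page = render_page(path, user)               # the expensive part
    if user is None:
        cache.set("page:" + path, page, time=PAGE_TTL)
    return page

def render_page(path, user):
    # Stand-in for template rendering plus database queries.
    return f"<html><body>{path} for {user or 'anonymous'}</body></html>"
```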

    Sunday
    Nov 02, 2008

    Strategy: How to Manage Sessions Using Memcached

    Dormando shows an enlightened middle way for storing sessions in cache and the database. Sessions are a perfect cache candidate because they are transient and smallish, and since they are usually accessed on every page view, removing all that load from the database is a good thing. But as Dormando points out, session caches have problems. If you remove expiration times from the cache and you run out of memory, then no more logins. If a cache server fails or needs to be upgraded, then you've just logged out a bunch of potentially angry users. The middle ground Dormando proposes is using both the cache and the database:

  • Reads: read from the cache first, then the database. Typical cache logic.
  • Writes: write to memcached every time; write to the database every N seconds (assuming the data has changed). There's a small chance of data loss, but you've still greatly reduced the database load while providing reliability. Nice solution (a minimal sketch follows).
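
    A minimal sketch of this middle way (mine, not Dormando's code; the flush interval is illustrative): reads try the cache before the database, writes always hit the cache, and dirty sessions reach the database at most every N seconds:

```python
import time

FLUSH_INTERVAL = 30  # seconds between database writes per session

class SessionStore:
    def __init__(self, cache, db):
        self.cache = cache          # e.g. a memcached client
        self.db = db                # anything with load/save methods
        self._last_flush = {}       # session_id -> last database write time

    def read(self, session_id):
        data = self.cache.get(session_id)
        if data is None:            # cache miss (eviction, server restart...)
            data = self.db.load(session_id)
            if data is not None:
                self.cache.set(session_id, data)
        return data

    def write(self, session_id, data):
        self.cache.set(session_id, data)       # cache is always current
        now = time.time()
        if now - self._last_flush.get(session_id, 0) >= FLUSH_INTERVAL:
            self.db.save(session_id, data)     # database trails by at most N seconds
            self._last_flush[session_id] = now
```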

    Sunday
    Oct 26, 2008

    Should you use a SAN to scale your architecture? 

    This is a question everyone must struggle with when building out their datacenter. Storage choices are always the ones I have the least confidence in. David Marks in his blog You Can Change It Later! asks the question Should I get a SAN to scale my site architecture? and answers no. A better solution is to use commodity hardware with directly attached storage on servers, and to partition across servers to scale and for greater availability. David's reasoning is interesting:

  • A SAN creates a SPOF (single point of failure) that depends on a vendor flying in to fix it when there's a problem. This can lead to long downtimes, during which you have no access to your data at all.
  • Using easily available commodity hardware minimizes risks to your company; it's not just about saving money. Zooming over to Fry's to buy emergency equipment provides the kind of agility startups need in order to respond quickly to ever-changing situations. It's hard to beat the power and flexibility (backups, easy-to-add storage, mirroring, etc.) of a good SAN, but David makes a good case.

    Friday
    Oct 24, 2008

    11 Secrets of a Cloud Scale Consultant That They Don't Want You to Know

    OK, there is no "they" and "they" wouldn't care if you knew anyway. After all, this isn't a blog about really important stuff like investing, acne cures, or cheap natural cleansing products. But the secrets are real. Super cloud scaling consultant Kent Langley has put together a comprehensive checklist to consider when developing for the cloud:

  • ORM for Data Partitioning and Query Splitting - Split queries between updates and deletes from the start
  • Monitoring process, resources, and uptime - Process Monitoring, Resource Monitoring, UpTime Monitoring
  • Performance Testing and Capacity Planning - Can't make good decisions without doing some degree of Performance Testing and Capacity planning.
  • Static vs. Dynamic Content splitting / CDN - Reverse Proxy, Splitting Static and Dynamic content
  • Bundling and Compressing JS and CSS - Bundle them, compress, version, and then properly cache those bundles (see the sketch after this list)
  • Logging - Log appropriately and monitor those logs
  • Pragmatic Caching - Most current web applications will have 3-5 layers of caching
  • Functional Decomposition - Decompose your entire application into functional silos
  • Deployment - It should be efficient, it should have a rollback capability, and it should be almost entirely automated
  • Asynchronous Practices - In most cases work can be queued and done by a separate process
  • Make sure your application processes are as lean as possible - More efficient code means fewer servers

    Please follow the link to Kent's post for a full explanation. To some this may seem obvious, but that doesn't mean it gets done. Good helpful stuff.
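
    As a rough sketch of the bundle/compress/version step (my own illustration, not Kent's; the file names are hypothetical), here's one way to concatenate JS files, gzip the result, and embed a content hash in the bundle name so it can be cached forever:

```python
import gzip
import hashlib
from pathlib import Path

def bundle_scripts(paths, out_dir="static"):
    """Concatenate JS files, gzip them, and version the bundle by content hash."""
    source = b"\n;\n".join(Path(p).read_bytes() for p in paths)
    digest = hashlib.md5(source).hexdigest()[:12]    # cache-busting version
    out = Path(out_dir) / f"app-{digest}.js.gz"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_bytes(gzip.compress(source))
    return out.name  # reference this name in your pages; cache it forever

# bundle_scripts(["js/jquery.js", "js/site.js"]) -> e.g. "app-3f2c9a1b7d4e.js.gz"
```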

    Related Articles

  • Joyent - Cloud Computing Built on Accelerators by Kent Langley

    Wednesday
    Oct 08, 2008

    Strategy: Flickr - Do the Essential Work Up-front and Queue the Rest 

    This strategy is stated perfectly by Flickr's Myles Grant: The Flickr engineering team is obsessed with making pages load as quickly as possible. To that end, we’re refactoring large amounts of our code to do only the essential work up front, and rely on our queuing system to do the rest. Flickr uses a queuing system to process 11 million tasks a day. Leslie Michael Orchard also does a great job explaining the queuing meme in his excellent post Queue everything and delight everyone. Asynchronous work queues are how you scalably solve problems that are too big to handle in real time. The process:

  • Identify the minimum feedback the client (UI, API) needs to know an operation succeeded. It's enough, for example, to update a client's view when posting a message to a microblogging service. The client probably isn't aware of all the other steps that happen when a message is added and doesn't really care when they happen, as long as the obvious cases happen in an appropriate period of time.
  • Queue all work not on the critical path to a job queuing system so the critical path remains unblocked. Work is then load balanced across a cluster and completed as resources permit. The more sharded your architecture is, the more work can be done in parallel, which minimizes total throughput time. This approach makes it much easier to bound response latencies as features scale (a minimal sketch follows this list).
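
    Here's a minimal sketch of the pattern (mine, not Flickr's code; they run their own queuing system, and something like Gearman or beanstalkd would fill the same role in practice). The request handler does only the essential write, answers the client, and leaves the rest to workers draining a queue:

```python
import queue
import threading

jobs = queue.Queue()  # stand-in for Gearman, beanstalkd, etc.

def post_message(user, text):
    message_id = save_message(user, text)       # the only work on the critical path
    # Everything else happens later, off the critical path.
    jobs.put(("fanout_to_followers", message_id))
    jobs.put(("update_search_index", message_id))
    jobs.put(("check_for_spam", message_id))
    return message_id                           # client gets its answer immediately

def worker():
    while True:
        task, message_id = jobs.get()
        print(f"processing {task} for message {message_id}")  # do the real work here
        jobs.task_done()

def save_message(user, text):
    return hash((user, text)) % 100000          # stand-in for a database insert

for _ in range(4):                              # a small pool of background workers
    threading.Thread(target=worker, daemon=True).start()
```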

    Queues Give You Lots of New Knobs to Play With

    As features are added data consumers multiply, so throwing a new task into a sequential process has a good chance of blowing latencies. Queueing gives much more control and flexibility over the performance of a system. With queues some advanced strategies you have at your disposal are:
  • Horizontal scaling. Add more processing resources to do more work in parallel.
  • Priority order processing. Paying customers can be processed first, for example. Take measures to avoid starvation (see the sketch after this list).
  • Aggregation. Work sitting on the same queue for the same user can be aggregated together so it can be processed as a batch.
  • Work canceling. A request later in the queue can cancel work earlier in the queue. These can just be dropped.
  • CPU limiting. When jobs have unbounded CPU time they destroy the latency of other jobs sitting in the queue. Bounding CPU time on jobs evens out latency for everyone.
  • Low priority work dropping. Under load, low priority jobs can be dropped. Just make sure you have background sweep processes that catch work that should have been done and redo it.
  • Admission control. Under load, clients can be told when to retry. This is the best form of flow control: end-to-end flow control with the client. We want to push back on work as high up the stack as we can. Stop the client from pushing work to you and you've accomplished something. Blind retries and timeouts just put immense pressure on the whole system.

    These ideas have been employed in embedded real-time systems forever, and now it seems they'll move into web services as well.
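
    As one concrete knob, here's a sketch of priority order processing with a simple aging scheme to avoid starvation (my own illustration; the priorities and aging rate are made up). Paying customers jump the queue, but waiting long enough raises any job's effective priority:

```python
import time

class PriorityJobQueue:
    """Lower priority number runs first; waiting jobs age upward to avoid starvation."""

    AGE_BOOST = 1.0 / 60.0   # gain one priority level per minute of waiting

    def __init__(self):
        self._jobs = []      # list of (priority, enqueue_time, job)

    def put(self, job, priority):
        # e.g. priority 0 = paying customer, 10 = background batch work
        self._jobs.append((priority, time.time(), job))

    def get(self):
        # Effective priority falls as a job waits, so batch work eventually runs.
        # A linear scan keeps the sketch simple; raises ValueError if empty.
        now = time.time()
        def score(entry):
            priority, enqueued, _ = entry
            return priority - (now - enqueued) * self.AGE_BOOST
        best = min(self._jobs, key=score)
        self._jobs.remove(best)
        return best[2]
```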

    What Can You do with Your Queue?

    The options are endless, but here are some uses I found out in the wild:
  • Backfill jobs. Backfill is what Flickr calls asynchronous jobs that: alter database tables in preparation for a new feature; fix existing features; or perform other operations that touch a lot of accounts, photos, or groups. For example, a sharding approach means related data is spread across many different shards. Deleting a user account requires visiting each shard to delete that user's data. Each of those deletes would be queued so they could be done in parallel. Now let's say a bug prevented some of the user data from being deleted. After the bug was fixed, the deletes for all the impacted user accounts would have to be scheduled again.
  • Low latency function call router.
  • Scatter/gather calls in parallel.
  • Defer expensive library calls.
  • Parallelize database queries.
  • Job queue system for a cluster. Efficiently use your whole pool of CPU power.
  • Sending scheduled, mail-merged emails.
  • Creating guest hosts
  • Put heavy code on backend instead of the web server.
  • Call a cron script to update topic hits and popular article hits.
  • Clean outdated, useless data from the database.
  • Resize photos.
  • Run daily reports.
  • Update search indexes.
  • Speed up batch jobs by running them in parallel.
  • SpamAssassin spamtraps.

    Queuing Implies an Event Driven State Machine Based Client Architecture

    Moving to queuing has architectural implications. The client and server are no longer connected in a direct request-response way. Instead, the server continually sends events to clients. The client is event driven instead of request-response driven. Internally, clients often simulate the request-response model even though Ajax is asynchronous. It might be better to drop the request-response illusion and just make the client an event-driven state machine. An event can come from a request, from asynchronous jobs, or from the activities of others that a client should see. Each client has an event channel that the system puts events on for the client to consume. The client is responsible for making sense of each event in its current context and is capable of handling any event regardless of its original source.
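
    A small sketch of such an event-driven client core (my own illustration, not from any of the posts above): a single event channel feeds a dispatch table, and the client handles each event the same way whether it came from its own request, a finished background job, or another user's activity:

```python
class EventDrivenClient:
    """Consumes events from a channel and dispatches on event type."""

    def __init__(self, channel):
        self.channel = channel          # anything with a blocking get()
        self.handlers = {
            "message_posted": self.on_message_posted,
            "job_finished": self.on_job_finished,
        }

    def run(self):
        while True:
            event = self.channel.get()  # e.g. {"type": ..., "payload": ...}
            handler = self.handlers.get(event["type"], self.on_unknown)
            handler(event["payload"])   # the event's source doesn't matter

    def on_message_posted(self, payload):
        print("update view with new message:", payload)

    def on_job_finished(self, payload):
        print("background work completed:", payload)

    def on_unknown(self, payload):
        pass                            # ignore events this client doesn't understand
```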

    Queuing Systems

    If you are in the market for a queuing system take a look at:
  • Gearman - Open Source Message Queuing System
  • Amazon's SQS. The latencies for this service tend to be high and variable so it may not be appropriate for all tasks.
  • beanstalkd.
  • Apache ActiveMQ.
  • Spread Queue
  • Rabbit MQ
  • Open AMQ
  • The Schwartz
  • Starling
  • Simple MQ
  • Roll your own.

    Related Articles

  • Flickr Engineers Do it Offline by Myles Grant
  • Queue everything and delight everyone by Leslie Michael Orchard.
  • Gearman - Open Source Message Queuing System
  • GridGain: One Compute Grid, Many Data Grids
