We've referenced this 189-slide masterpiece by Ask Bjorn Hansen before, but it was hidden without its own first-class link. He describes his presentation as 3 hours of 5-minute lightning talks, and that sounds about right.
The presentation covers the overall platform and architecture considerations involved in tuning applications from a holistic perspective. You’ll be shown how to design scalable architectures for dynamic, high-volume web sites. Topics covered include caching, scalable database design, replication architecture, load-balancing, and architectural decisions derived from many years of experience.
His prime directive of scaling: Think Horizontally at every point in your architecture, not just at the web tier.
You may not agree with everything, but there's a lot of useful advice. Here's a summary of some of what is covered:
Benchmarking
Vertical scaling sucks.
Horizontal scaling rocks.
Run many application servers
Don't keep state in the app server
Be stateless
Optimization is necessary, but it is different from scalability.
Cache things you hit all the time.
Measure, don't assume, check.
Make pages static.
Caching is a trade-off.
Cache full pages.
Cache partial pages.
Cache complex data.
MySQL query cache is flushed on update.
Cache invalidation is hard.
Replication scales reads, not writes.
Partition to scale writes. 96% of applications can skip this step.
Master-master setup facilitates on-line schema changes.
Create summary tables and summary databases rather than do COUNT and GROUP-BY at runtime.
Make code idempotent. If it fails you should just be able to run it again.
Load data asynchronously. Aggregate updates into batches.
Move processing to application and out of the database as much as possible.
Stored procedures are dangerous.
Add more memory.
Enable query logging and take a look at what your app is doing.
Run different MySQL instances for different work loads.
Config tuning helps, query tuning works.
Reconsider persistent DB connections.
Don't overwork the database. It's hard to scale.
Work in parallel.
Use a job queuing system.
Log http requests.
Use light processes for light tasks.
Build on APIs internally. Clean loosely coupled APIs are easy to scale.
Don't incur technical debt.
Automatically handle failures.
Make services that always work.
Load balancing is the key to horizontal scaling.
Redundancy is not load-balancing. Always have n+1 capacity.
Plan for disasters.
Make backups.
Keep software deployments easy.
Have everything scripted.
Monitor everything. Graph everything.
Run one service per server.
Don't ever swap memory for disk.
Run memcached if you have extra memory.
Use memory to save CPU or IO. Balance memory vs CPU vs IO.
Netboot your application servers.
There are a lot of good slides on what to graph.
Use a CDN.
Use YSlow to find client-side problems.
This is just a high-level blitz through the presentation. Topics are given a lot more detail in the presentation. Audio of Ask's dulcet tones would be nice, but there's still a lot to learn here.
Reader Comments (3)
I'm familiar with Bjorn's presentation and I concur that it's a masterpiece. It covers some really essential issues and offers a lot of great advice that I follow in my everyday work at the moment. I strongly recommend reading it, as it simply enriches your experience even if you decide not to play by the rules proposed there.
Just an FYI, but his first name is actually 'Ask' not 'Bjorn'.
Ack, thanks for the correction.