Wednesday, December 5, 2012
5 Ways to Make Cloud Failure Not an Option

With cloud SLAs generally being worth what you don't pay for them, what can you do to protect yourself? In "AirBNB didn’t have to fail," Sean Hull has some solid advice on how to deal with outages:
- Use redundancy. Make the database and web server tiers redundant using Multi-AZ deployments or, alternatively, read replicas.
- Have a browsing-only mode. Give users a read-only version of your site. Users may not even notice failures, as they will only see problems when they need to perform a write operation.
- Web applications need feature flags. Build in the ability to turn major parts of your site on and off, and flip the switch when problems arise.
- Consider Netflix’s Simian Army. By randomly causing outages in your application, you can continually test your failover and redundancy infrastructure.
- Use multiple clouds. Use Redundant Arrays of Inexpensive Clouds as a way of surviving outages in any one particular cloud.
None of these are easy and it's worth considering that your application may not need them at all. Life will almost always go on anyway.
Sean has many more details in "AirBNB didn’t have to fail."
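As a rough illustration of how the browsing-only mode and feature flag ideas can fit together, here is a minimal Python sketch. Everything in it is hypothetical: the `FeatureFlags` store, the `requires_flag` decorator, the `"writes"` flag name, and the `create_booking` / `view_listing` handlers are invented for the example and are not taken from Sean's article or from any particular framework. The idea is to check the flag in one place, around the write handlers, so that read paths keep working when writes are switched off.

```python
import functools
from dataclasses import dataclass, field


class FeatureDisabledError(Exception):
    """Raised when a handler is called while its feature flag is off."""


@dataclass
class FeatureFlags:
    # Hypothetical in-memory store. A real site would likely keep flags in
    # shared configuration so every web server sees the same values.
    flags: dict = field(default_factory=lambda: {"writes": True})

    def is_enabled(self, name: str) -> bool:
        # Unknown flags are treated as disabled.
        return self.flags.get(name, False)

    def enable(self, name: str) -> None:
        self.flags[name] = True

    def disable(self, name: str) -> None:
        self.flags[name] = False


def requires_flag(flags: FeatureFlags, name: str):
    """Wrap a handler so it only runs while the named flag is enabled."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            if not flags.is_enabled(name):
                raise FeatureDisabledError(f"'{name}' is temporarily unavailable")
            return handler(*args, **kwargs)
        return wrapper
    return decorator


FLAGS = FeatureFlags()


@requires_flag(FLAGS, "writes")
def create_booking(listing_id: int, guest: str) -> dict:
    # Write path: in a real application this would touch the primary database.
    return {"listing_id": listing_id, "guest": guest, "status": "confirmed"}


def view_listing(listing_id: int) -> dict:
    # Read path: not gated, so browsing keeps working in read-only mode.
    return {"listing_id": listing_id, "title": f"Listing {listing_id}"}


if __name__ == "__main__":
    print(create_booking(1, "ann"))
    FLAGS.disable("writes")  # flip the switch when the primary database is down
    print(view_listing(1))
    try:
        create_booking(1, "ann")
    except FeatureDisabledError as exc:
        print("read-only mode:", exc)
```

Gating the write handlers with one decorator is a design choice for the sketch: the "is the database down?" check lives in a single place rather than being repeated around every form on the site.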
Reader Comments (2)
"None of these are easy..."
Is having a browse only mode so hard to implement?
@rc: maybe not hard, but it requires extra work.
Imagine putting "if (db is down) then disable form submit" on every zone of your site that accepts data input.
As hard as putting indexes on a 200-table database :)