A bit over a month ago Amazon experienced its infamous AWS outage in the US East Region. As a cloud evangelist, I was intrigued by the history of the outage as it occurred. There were great posts during and after the outage from those who went down. But more interestingly for me as architect were the detailed posts of those who managed to survive the outage relatively unharmed, such as SimpleGeo, Netflix,SmugMug, SmugMug’s CTO, Twilio, Bizo and others.
Reading through the experience of others, I tried to summarize the patterns, principles and best practices that emerged from these posts, as I believe we can learn a lot from them on how to design our business applications to truly leverage on the benefits that the cloud offers in high availability and scalability.
The main principles, patterns and best practices are:
Click to read more ...