Resiliency is the New Normal - A Deep Look at What It Means and How to Build It
Monday, December 3, 2012 at 9:08AM
HighScalability Team

Perhaps it is because the whole world feels as if it’s riding on the edge of a jagged knife that the idea of resilience is becoming a dominant theme across so many domains. Resilience in beings first developed when cells evolved a way of maintaining inner order through homeostatic (stability through constancy) mechanisms. After homeostasis was mastered, allostasis (stability through change) developed as a way of responding to a dynamic world of challenge. In economics we have the idea of Transition Towns, which emphasizes developing local economies as a way of being resilient to global failures. In agriculture we have the idea of permaculture, building a permanent agriculture by embracing diversity, sustainability, perennial systems, avoiding monocultures, and using edge thinking. There are many more examples, including psychological resilience and the legendary resilience of ecosystems.

To explore the idea of resiliency we’ll look at a few sources:

The talk by Dr. Richard Cook was given at Velocity 2012 and is by far the most practical of all the talks, as it directly relates to DevOps, but I think each of the other talks holds its own special fascination as well. I hope you’ll share my conviction that this is incredibly cool stuff that has only really begun to be explored and applied.

Collapse Dynamics: Phase Transitions in Complex Social Systems

Noah Raford has a great series of amazing videos deeply related to resilience: Collapse Dynamics: Phase Transitions in Complex Social Systems.

Some key ideas from his talk:

Taleb on Black Swans, Fragility, and Mistakes

Nassim Taleb of Black Swan fame has resiliency at the heart of many of his ideas. Taleb on Black Swans, Fragility, and Mistakes is a very good talk on the subject.

Some key ideas from his talk:

Why Cities Keep on Growing, Corporations Always Die, and Life Gets Faster

Geoffrey B. West, theoretical physicist at the Santa Fe Institute, gives some really awesome talks. Take a look at Why Cities Keep on Growing, Corporations Always Die, and Life Gets Faster. He’s on TED and all over YouTube. His talk on Scaling Laws In Biology And Other Complex Systems is definitely worth an investment of time.

Some key ideas from his talks:

How Complex Systems Fail

There’s a long history of thinking about resiliency in computer systems. Autonomic Computing is one such vision. The current, more pragmatic champion of resilience in the software world is the modern DevOps movement.

On this subject, Dr. Richard Cook, Professor of Healthcare Systems Safety and Chairman of the Department of Patient Safety at the Kungliga Tekniska Högskolan, was invited to talk at the Velocity 2012 conference. His fascinating talk, How Complex Systems Fail, is just detailed enough to be practical and high level enough to inspire new directions.

Why Don’t Systems Fail More Often?

The normal world is not well behaved. The real surprise is not that there are so many accidents but that there are so few. Is this because of or in spite of our system designs? We have all had the sense of barely escaping or just getting by. It seems like we should have crashes all the time. Why don’t we? What does that mean for IT design, implementation, and ops?

Summary of 25 years of Research

System as Imagined vs System as Found

What are people doing in these As Found systems? What should operations look like?

Resilience is the combination in systems of these four activities:

These are terms of what we are trying to describe as resilience.

Reliability is made out of these things at design time:

What we really want is resilience:

How do we design for resilience?

What’s the resilience agenda?

Final Thoughts

Unfortunately there’s no easy way to wrap all this up into a tight little TLDR. “Be resilient” just sounds a little silly after experiencing the vast richness of the subject. It’s a pool of infinite depth.

And some of it is outright depressing. The idea that we need to keep increasing the pace of innovation just to stave off the next collapse is sobering. That there are limits to growth. That the same interconnectivity we crave as a way of bringing more richness to the world is also the seed of inevitable collapse.

It confounds me that software is simply not the right tool for creating software. And I find it perplexing, in an all too human way, that diversity and heterogeneity are such key aspects of resilience, yet we continually find ourselves shunted onto platforms of convenience.

Dr. Richard Cook had a more easily recognizable path, one that DevOps is well on its way to following. Leading companies have been successfully unifying the System as Imagined with the System as Found, so that there aren’t these disparate communities formed around a system. Learning is being pushed up to the developers and back down through the code so the System as Found can become wedded to the System as Imagined through the entire stack.

But what Dr. Cook asks for is something developers can’t deliver: such a clear understanding of a complex system that you can hold it in the palm of your hand, turn it, twist it, interrogate it, and make it dance to your tune. Complex systems can only be built incrementally, which means there is only ever an incremental understanding of how the whole thing works, which means it can never be opened to the degree he wishes. A system will always be in large part subconscious, just as in the human brain the conscious mind is only the smallest window on a vast subconscious mind.

If there is a common theme it’s that shit happens, you can’t predict it, you can’t stop it, but you can be prepared for the next transition.

Article originally appeared on (http://highscalability.com/).
See website for complete article licensing information.