- Leandro Herrero - https://leandroherrero.com -

Do you have a ‘Chaos Monkey’ in your management system? You should.

Many organizations have a Risk Management function of some sort, often scattered amongst different constituencies: manufacturing, engineering, R&D, etc. It is also embedded in Quality Systems, such as ISO. Financial institutions, or indeed financial functions within the company, will have some form of system. However, the variations in depth and seriousness are enormous: from well-defined ‘stress tests’ imposed on banks to a vague list of potential risks with no more than lip service paid to actions, one can find anything in between.

In my consulting experience of Risk Management, outside the most standardized areas of operations, I’ve seen more lip service and bad planning than the opposite.

I have argued in these Daily Thoughts [1] that companies need to devise their own routine ‘stress tests’, beyond the financials, to understand their adaptability and indeed their chances of survival. But I’d like to take this further and suggest that these ‘stress tests’ need to be formalised as part of the leadership capabilities.

A good model is Netflix’s ‘Chaos Monkey’. This is how the successful video streaming company, with lots of avant-garde organizational and management structures, defines their ‘Chaos Monkey’: ‘A tool that randomly disables our production instances to make sure we can survive this common type of failure without any customer impact. The name comes from the idea of unleashing a wild monkey with a weapon in your data center (or cloud region) to randomly shoot down instances and chew through cables — all the while we continue serving our customers without interruption. By running Chaos Monkey in the middle of a business day, in a carefully monitored environment with engineers standing by to address any problems, we can still learn the lessons about the weaknesses of our system, and build automatic recovery mechanisms to deal with them. So next time an instance fails at 3 am on a Sunday, we won’t even notice’.

I think we should hire some of these Monkeys, with proper role descriptions (OK, job sharing allowed, but not working from home), and give them the formal role of generating some chaos to test our abilities and resilience. And, as in my previous Daily Thought, I am not talking about software or technology but about day-to-day business: hiring, product recalls, sudden acquisitions, etc.

Before you make the expected and easy joke that you already have those Monkeys in your organization, that they are sitting in Marketing, or Sales, or HQ, or, indeed, that you have some in your own team wreaking havoc, I’d like you to consider the serious ‘Chaos Monkey’ that I am talking about.

OK, end of playing with words. Do consider formal simulations of how you will cope with unexpected issues, and do extend this to the ‘soft aspects’ of your management, not just the hard ones.

Instead of cables and servers ‘à la Netflix’, imagine processes, systems, your human capital. Do you really know how many people you have ‘at risk’ of leaving soon, and, if you do, do you really have a plan for that?