Risk management at a systemic level is complicated enough that many organizations deem it practically impossible. The mistake many risk managers make is trying to identify every potential exposure in the system, every possible scenario that could lead to loss. This is how risk managers go crazy, since not even Kafka could describe every possibility. Risk management as a discipline does line up nicely with probability theory, but holistic approaches to risk management deviate from the sister science of insurance.

Venice. Yeah. Try and get flood insurance *there*.

Insurance is priced on the expected value of specific events: what is the probability that this car and this driver will be involved in a collision, and how much will the resulting damage cost to repair or replace? Factors include the age and quality of the car as well as the age and quality of the driver, average distance driven per day, geographic area and traffic conditions. The value of the vehicle is estimated and a range of collision costs is assumed. Flood insurance is similarly specific: what is the probability that this property will sustain damage in flood conditions, and how much will it cost to protect or fix the property? Average precipitation, elevation, foundation quality, and assessed property value are all factored into the decision.
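To make the "specific event" idea concrete, here is a minimal sketch of the expected-value arithmetic, assuming a single annual probability and a single average cost per event. The class name, the function name, and every number are hypothetical illustrations, not actuarial figures.

```python
from dataclasses import dataclass


@dataclass
class InsuredEvent:
    """One specific, well-bounded event (hypothetical structure)."""
    name: str
    annual_probability: float  # assumed chance the event occurs in a year
    expected_cost: float       # assumed average cost to repair or replace


def expected_annual_loss(event: InsuredEvent) -> float:
    """Expected value of the loss: probability times cost."""
    return event.annual_probability * event.expected_cost


# Made-up numbers, for illustration only.
collision = InsuredEvent("collision", annual_probability=0.05, expected_cost=8_000.0)
flood = InsuredEvent("flood damage", annual_probability=0.01, expected_cost=50_000.0)

for event in (collision, flood):
    print(f"{event.name}: expected annual loss ~ {expected_annual_loss(event):.2f}")
```

The calculation only works because each event is narrow enough to attach one probability and one cost to it.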

As complicated as actuarial science is, insurance can be written because insurance is specific. Risk management is not specific: it is systemic.

If statistics are going to be used to guide investment in risk mitigation, the mitigation plan is limited to discrete risk exposures. An open-ended exposure like “the company gets hacked” is not answered well in dollars and cents. A little hack? A big hack? A hack facilitated by an insider? A hack that results in stacks of stolen employee data? Of pre-market intellectual property? A hack that destroys customer data?

For evaluating complex systems, like organizations, networks, or ecosystems, methods built on systems analysis are more likely to succeed than itemizing risks and summing them across multiple domains. One benefit is that a clearer picture can be derived with less supporting quantitative data. Another is that a working risk model takes less time to construct, because the assessment can be conducted top-down rather than bottom-up. Given the lack of detail and specificity, I am dubbing this “Impressionist” Risk Management, also (in my mind) known as “intuitive” or “good enough” Risk Management. (Or, after a brief Twitter exchange with @beaker and @dakami, “optimistic” Risk Management.)

This is presented as an alternative to more absolutist approaches, such as the ones Mike Dahn described in his BSides Boston talk, Tastes Great vs Less Filling: Deconstructing Risk Management (A Practical Approach Towards Decision Making): “Just as there are two sides to every coin, there are two schools of thought in risk management. One camp believes that there is never enough data to make statistically significant risk decisions, due to the unknown-unknowns and never really knowing the entire population of data breaches. Another camp believes that we have well detailed information about specific domains and using Bayesian math we can come to conclusions on how to manage risk.” (from Mike’s abstract). I think Mike was setting up these approaches (which I dub “absolutist”) in order to then make an argument for more practical, common sense approaches. I agree with focusing on pragmatism and action, so I want to add a third camp {pitches tent here}.

Using data to inform design decisions is the right thing to do, and statistics can bring a level of precision that is quite useful. However, when setting a risk management strategy, how much data is available to sample matters less than how the risk exposures to examine are selected. Classic risk management uses a “simple” approach that is actually quite effective: identify assets, assign value. Determine exploitable vulnerabilities. Identify likely threats. Implement reasonable mitigations. Monitor. Revise. Don’t spend more on mitigation than the assets are worth. Generally, a reasonable working risk assessment model for a system can be developed by a small team (one with working knowledge of the mechanics of the system) within a week or less, using a whiteboard and sticky notes. Getting a comprehensive assessment of course takes longer, since subject matter experts and data are needed from all over the system to prove assumptions or identify gaps that an initial team would never observe casually.
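The whiteboard exercise above can be sketched as a small data model. This is only an illustration under assumptions: the `Asset` and `Mitigation` structures, the field names, and the sample entries are all hypothetical, and the only rule encoded is the last one, not spending more on mitigation than the asset is worth.

```python
from dataclasses import dataclass, field


@dataclass
class Mitigation:
    name: str
    cost: float  # assumed cost of the control


@dataclass
class Asset:
    name: str
    value: float  # assigned value, as estimated by the team
    vulnerabilities: list = field(default_factory=list)
    threats: list = field(default_factory=list)
    mitigations: list = field(default_factory=list)


def mitigation_spend(asset: Asset) -> float:
    """Total planned spend on mitigations for one asset."""
    return sum(m.cost for m in asset.mitigations)


def overspent(assets: list) -> list:
    """Names of assets where mitigation costs exceed the asset's value."""
    return [a.name for a in assets if mitigation_spend(a) > a.value]


# Hypothetical sticky notes from a whiteboard session.
assets = [
    Asset("customer database", 500_000.0,
          vulnerabilities=["unpatched server"], threats=["external attacker"],
          mitigations=[Mitigation("patching program", 40_000.0)]),
    Asset("lobby kiosk", 2_000.0,
          vulnerabilities=["open USB port"], threats=["curious visitor"],
          mitigations=[Mitigation("custom hardened enclosure", 5_000.0)]),
]

print(overspent(assets))
```

Monitoring and revising then amount to updating these entries and rerunning the check as the system changes.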

As in many domains, when it comes to managing risk, the perfect is the enemy of the good.

Beyond exposure identification, another dimension that needs to be considered is the presence of change drivers within a system. Except in cases of catastrophic failures (hurricanes, fires, and plagues of locusts) most change within a system is based on behavior of participants or interconnections between components. These changes can be approximated if we have some understanding of the drivers. In economics, it’s supply vs demand. In transaction-based payment systems, it’s cost vs liability. In a social network, maybe it’s quantity vs quality. It is important to understand drivers because often drivers can be influenced more easily than controls can be completely implemented (i.e. transmuting an exposure can be easier than eliminating it). For example, instead of reducing the vulnerability surface of a system, it may pay to reduce the perceived value of exploiting the vulnerabilities that do exist. Of course in some cases (like catastrophic failures) mitigations may not work at all.

Both impressionist and absolutist approaches to risk management have a hard time correctly incorporating catastrophe into risk models, but for different reasons. Impressionists get distracted by catastrophes; they are the easiest risks to identify but are often orthogonal to system interdependencies. Absolutists are attracted to catastrophes because more data is typically available; many non-catastrophic risk exposures are exploited every day, but the feedback loop for reincorporating them into a quantitative model is limited. Of course, catastrophes can introduce skew into incomplete distributions. Which brings me to another problem with the absolutist approach: summing risk exposures can produce a total system-wide exposure greater than the value of the system itself. This could be corrected by a quantitative understanding of the interdependencies in the system, but those are (probably) impossible to model effectively.
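A toy example shows how the summing problem arises. It assumes, purely for illustration, that each exposure is scored as the full value of every asset it could touch; the asset names, values, and exposure scenarios are hypothetical. Counting each asset once is a crude stand-in for interdependency, not a real model of it.

```python
# Hypothetical asset values; together they are the whole "system".
asset_values = {
    "customer_db": 500_000,
    "intellectual_property": 300_000,
    "payroll": 200_000,
}

# Hypothetical exposures, each listing the assets it could compromise.
exposures = {
    "external hack": {"customer_db", "intellectual_property"},
    "insider leak": {"customer_db", "payroll"},
    "ransomware": {"customer_db", "intellectual_property", "payroll"},
}


def naive_total(exposures: dict, values: dict) -> int:
    """Sum every exposure independently; shared assets are counted repeatedly."""
    return sum(values[a] for affected in exposures.values() for a in affected)


def deduplicated_total(exposures: dict, values: dict) -> int:
    """Count each affected asset once, however many exposures touch it."""
    affected = set().union(*exposures.values())
    return sum(values[a] for a in affected)


system_value = sum(asset_values.values())
print("system value:      ", system_value)
print("naive summed total:", naive_total(exposures, asset_values))
print("deduplicated total:", deduplicated_total(exposures, asset_values))
```

With these made-up numbers the naive sum is two and a half times the value of the system, because the customer database is counted in all three scenarios.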

Ultimately, self-interested and self-aware system owners will try to understand their risk exposures, at least informally, if only to keep systems operational. (Not that we can assume all critical systems in our ecosystem have owners that are self-interested or self-aware, and “operational” is relative.) A blanket insurance policy will probably not be an option for addressing systemic risk exposures, which leaves system owners with some decisions to make. While the depth (how far into the technical/organizational stack the examination needs to go) and complexity (the economics and change drivers of a system) make conducting a holistic and comprehensive risk assessment difficult, giving it a shot with imperfect data and no experience is better than no assessment at all. At least to get started. The best way to go? Think Monet and Van Gogh: start with the obvious, use broad brush strokes, and stand back every once in a while to see what picture appears.

Under Creative Commons License: Attribution