This simple risk analysis method provides data on hard-to-measure risks. It is geared toward security teams who want to reduce risks and measure those reductions.
It can be explained briefly:
Far more digestible explanations exist in this slideshow and this Medium article, if you are looking for less intimidating introductions to probabilistic risk.
This is based on the Probabilistic Risk Assessment methodology used in aerospace, nuclear, oil, and other industries with environmental risk. This document simplifies these practices and focuses on the available research to improve collaborative forecasting.
When you are done with this document, you might own a forecast that looks something like this.
A remote adversary will access a production server in Q2.
This is not an example of fortune telling. Instead, we are measuring uncertainty about a risk. Uncertainty is part of what drives our intuition about risk, and is the basis for most of the decisions we make.
For example, total uncertainty about the above scenario would result in a 33%/33%/33% forecast. If you wouldn't agree with those probabilities, it's because you have a measurable uncertainty that differs from them.
This document suggests ways to reduce cognitive bias, introduce skepticism, develop evidence gathering, and find consensus in prioritizing risk.
A scenario is influenced by any combination of a threat, a vulnerability, and an impact. Choose a scenario that best reflects your concerns and mitigation efforts. Make sure it is time-specific, so it can expire in a reasonable time. Here are some examples:
Choose a scenario that best represents the risk you're concerned about, and how you are organizing your effort to reduce that risk.
Here's an example of how this differs, with small variances on a similar risk:
The first scenario is influenced by reducing the odds that a bank robber could get to the vault. Any risk mitigation effort will be centered around the bank robber's ease of access to the vault.
The second scenario is most influenced by limiting the overall amount of funds that are needed in the vault. It assumes that a bank robber may eventually reach the vault, and focuses instead on limiting the loss when that happens.
The third scenario implies aspects of both. Mitigation efforts for the third scenario will be influenced by reducing the probability of the attack, but also by anything that would slow the attacker down or limit their capacity to succeed. By being more vague about the threat actor, it also brings something like an insider threat into the calculation.
The fourth scenario is related to a specific vulnerability. This would encourage efforts to mitigate the vulnerability entirely, or add compensating controls focused on a specific problem.
You can see that being intentionally specific or vague in developing a scenario allows you to include or exclude known and unknown risks. This concept is developed further here. Skillfully crafting the right scenario attracts mitigation techniques that are properly scoped to your expectations, while still leaving room for creative solutions.
The scenario needs two qualities for us to forecast it. The first is a reasonable timeframe in which the event could occur (months, a quarter, a year). The second is a structured set of outcomes for the forecast, which we can build in many ways.
The following forecast examples measure these various aspects of probability and / or impact. Someone could forecast the probabilities involved with any one, or multiple, of these outcomes.
If we choose to use the third example, the forecasts would look like this:
A remote adversary gains a foothold on a production server in Q2.
Some events happen so infrequently, you are simply forecasting if it will happen at all. A nice structure for that is divvying up the probability of something happening once, more than once, or not at all.
85%: Won't happen. 10%: Will happen once. 5%: Will happen more than once.
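To make this concrete, here is a minimal sketch in Python (the field names are illustrative, not part of this method) of recording a forecast with this outcome structure and checking that the probabilities cover 100%:

```python
# A minimal sketch of recording the outcome structure above.
# The scenario text and probabilities are the example values from this section.

forecast = {
    "scenario": "A remote adversary gains a foothold on a production server in Q2.",
    "outcomes": {
        "won't happen": 0.85,
        "will happen once": 0.10,
        "will happen more than once": 0.05,
    },
}

# The outcomes are mutually exclusive and exhaustive, so they must sum to 100%.
assert abs(sum(forecast["outcomes"].values()) - 1.0) < 1e-9
```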
Scenarios related to impact do well with estimating a range of values that you are mostly sure the correct result will fall into. For instance:
I am 90% confident I will lose $0-100
You can also estimate whether an incident will become an initiating event for a larger attack.
10% chance the adversary will go on to pivot throughout our network.
Identify the events that lead up to this larger event. This process looks very similar to threat modeling.
What events might take place before the big event occurs? There are likely very many. Some examples:
The research done around human estimation suggests that forecasters who decompose a problem will have improved forecasts. This is sometimes referred to as the "Fermi Problem".
This is why it's sometimes valuable to substitute relevant scenarios for ones we lack data on. For instance, you can better estimate the average height of an apple tree by knowing the average heights of orange and pear trees. It helps bring you into a ballpark with "outside" information, to which you'll apply your own estimation skills.
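As an illustration of decomposition, here is a toy sketch where the component events and their probabilities are entirely invented, and the events are assumed to be independent; it only shows how component estimates can combine into an estimate for the larger scenario:

```python
# A toy Fermi-style decomposition. Every number below is illustrative,
# and the component events are treated as independent for simplicity.

component_estimates = {
    "an adversary targets us this quarter": 0.50,
    "a phishing or exploit attempt gains a foothold": 0.30,
    "that foothold reaches a production server": 0.40,
}

# The larger event requires each component event to occur.
p_scenario = 1.0
for event, probability in component_estimates.items():
    p_scenario *= probability

print(f"Decomposed estimate for the larger scenario: {p_scenario:.0%}")  # 6%
```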
You may discover that the risk you have identified already has a "kill chain", or an "attack matrix" defined.
In probabilistic risk assessments in the nuclear industry, these are often called "Initiating Events".
NASA also calls these initiating events, and has documented this process here (3.3.1).
Research suggests that training will drastically improve the reliability of our forecasts. This online training resembles the training used in that research. Our ultimate goal is to calibrate the Brier score of people involved with this process. Established biases should be taught to forecasters in this training, as well as concepts like the "overconfidence effect".
We want participants to resemble weather forecasters as much as possible. That is, people who are constantly held accountable for the outcomes of their forecasts.
Humans are wildly overconfident, and experts can be even worse. Research strongly suggests this to be the case.
This overconfidence can be measured easily in any person. The Brier score is a dead simple method for measuring how well a person's sense of confidence is calibrated against their actual knowledge. It can also be fixed quickly.
The most important aspect of forecasting is the Brier Score. A feedback loop that holds you accountable for your miscalibration will rapidly improve it. By being constantly tested and accountable for our forecasts, we can quickly prevent ourselves from becoming overconfident.
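For reference, here is a minimal sketch (assuming forecasts are stored as simple Python dictionaries) of computing a multi-outcome Brier score over a set of resolved forecasts; lower is better, and 0.0 means every forecast put 100% on the outcome that actually occurred:

```python
# A minimal sketch of the multi-outcome Brier score: the squared difference
# between each forecast probability and the observed outcome (1 if it happened,
# 0 if it didn't), averaged over all resolved forecasts. Lower is better.

def brier_score(forecasts, outcomes):
    """forecasts: list of dicts mapping outcome label -> probability (0..1)
    outcomes:  list of the labels that actually occurred, one per forecast"""
    total = 0.0
    for probabilities, actual in zip(forecasts, outcomes):
        for label, p in probabilities.items():
            observed = 1.0 if label == actual else 0.0
            total += (p - observed) ** 2
    return total / len(forecasts)

# Example: one resolved forecast where the event never happened.
forecasts = [{"never": 0.85, "once": 0.10, "more than once": 0.05}]
outcomes = ["never"]
print(brier_score(forecasts, outcomes))  # ≈ 0.035
```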
There are plenty of online materials to recreate the training used in the above cited research. The aim of training is to make individuals self aware of common sources of cognitive bias, our natural reactions to new information, and overconfidence.
This research shows that calibration can be trained and measurably improved.
In general, training and awareness of cognitive bias, combined with practice, are critical.
Now your forecasters should have opinions and demands for evidence that influence the potential outcomes of these scenarios. We want to put decision-making data in front of forecasters.
Do not limit your imagination to whatever frequency based historical statistics you can pull together.
During training, a forecaster should become hyper-aware of their tendency to overvalue certain types of data, and our process will limit the volatility introduced by a drastic bias. This allows us to introduce all forms of information.
Audit results, interviews, opinion briefs, and cross-industry data are all useful, and the forecaster will weigh them in a measured way.
Forecaster demand is important. Don't bother collecting data that a forecaster won't find useful. No need to overdo it on collection.
Nobel Prize-winning research describes effective forecasting as using "outside" and "inside" data. For instance, industry data saying it takes eight months to discover a breach would be considered "outside" data. If you have data leading you to believe you are better or worse than the industry standard, that would be considered "inside" data. Seek out both, and make sure your forecasters understand the difference.
As forecasts occur over time, previous forecasts themselves can be useful as "stand-in" data where no other data exists.
However, data collection that is useful for statistical modeling will quickly replace a forecast, or better inform a forecaster than a previous forecast alone. Always prefer better data.
The Blockchain Graveyard is an example of postmortem data with a low signal of truth. It relies on expert estimation of root cause based on unreliable data. It is still extremely useful as data for forecasting.
If forecasting methods become more common, it will be simple to share forecasts and data for similar scenarios with companies within your industry, or from higher risk scenarios or industries. If a university network can produce data showing that ~50% of hosts on a dormitory wireless network exhibit signs of compromise over the course of a year, a forecaster can use that data to narrow their estimates for a managed, well-invested network.
This is as simple as recording your forecasts into a spreadsheet. A group of forecasters should get a chance to discuss their forecasts and modify them. This will help reveal scenario misunderstandings, or dramatic differences in how evidence was interpreted per person.
Record the average of forecasted values to create a representative forecast.
There is a lot of flexibility in how often you measure forecasts over time, as these are numeric values and you can track their progress very simply. You may want to trigger measurement at certain milestones to ensure that the team still feels progress will be made.
Our database is dumped and exfiltrated during an infrastructure breach this year.
Forecast: Percentage of never happening, happening once, or more than once.
Forecaster | Never | Only once | More than once |
---|---|---|---|
1 | 75% | 20% | 5% |
2 | 80% | 18% | 2% |
3 | 40% | 25% | 35% |
Average | 65% | 21% | 14% |
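The representative forecast in the "Average" row is just an arithmetic mean of the panel's probabilities. A minimal sketch, using the numbers from the table above, might look like this:

```python
# A minimal sketch of averaging a panel's forecasts into a representative
# forecast, using the three forecasters from the table above.

panel = [
    {"never": 0.75, "only_once": 0.20, "more_than_once": 0.05},  # forecaster 1
    {"never": 0.80, "only_once": 0.18, "more_than_once": 0.02},  # forecaster 2
    {"never": 0.40, "only_once": 0.25, "more_than_once": 0.35},  # forecaster 3
]

average = {
    outcome: sum(forecast[outcome] for forecast in panel) / len(panel)
    for outcome in panel[0]
}
print(average)  # ≈ {'never': 0.65, 'only_once': 0.21, 'more_than_once': 0.14}
```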
Should new information come to the forecasters at any time, they may update these scenarios. For instance, let's say your team receives scary results from a penetration test. This may shift probability out of the "never" column and into the "once" column.
Alternatively, a deployment of second factor authentication may move probability to the left, into the "never" column.
This example shows the use of multiple forecasters. A simple average of multiple forecasts has been shown to help smoothen out biases and misunderstandings across multiple people.
Panels of forecasters with expertise in their subject area are a very common risk assessment tool used in many high risk industries.
Diversifying a panel with asset owners, leadership, technical experts, or even outside consultation will help add integrity to the forecast.
Of note, teams of less informed forecasters have competed with intelligence agencies armed with classified information, and won by a broad margin. Likewise, intelligence agencies that employ these forecasting methods with tight feedback loops see very positive results when forecasting complex scenarios.
For great information on why ordinal scales should not be used, see the research by Sherman Kent that suggested improvements to the National Intelligence Estimate process. The value of estimates made by the entire intelligence community was diminished by misunderstandings of the words used to express confidence, which were fixed with a quantitative system.
Additionally, see the "Delphi" method, developed by RAND in the 1950s, which suggests a "forecast, talk, forecast" approach to forecasting with a panel.
Lastly, you may want to train individuals on different ways to interpret the probabilities they assign in frames of time. For instance, a 10% annual probability should occur once every ten years on average, and a 50% annual probability once every other year on average. Someone may quickly say 5% without also reasoning that it means a probability of once every 20 years. Confronting a forecaster with that realization will help them settle their forecast into a place that makes more sense to them.
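A small sketch of that time framing (the conversion is just the reciprocal of the annual probability) can make the sanity check mechanical:

```python
# A small sketch of the time framing above: an annual probability of p implies
# an average recurrence of roughly once every 1/p years, which is a useful
# sanity check to put in front of a forecaster.

def average_years_between_events(annual_probability):
    return 1.0 / annual_probability

for p in (0.05, 0.10, 0.50):
    years = average_years_between_events(p)
    print(f"{p:.0%} per year -> about once every {years:.0f} years")

# 5% per year -> about once every 20 years
# 10% per year -> about once every 10 years
# 50% per year -> about once every 2 years
```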
When the forecast timeframe "closes", discuss with the team if the event occurred. You can discuss the root cause, update any scenario decomposition, and begin maintaining the event as a form of historical data for future forecasting.
It is critical to hold a review and observe the correctness of forecasts. Reward accurate forecasts and post-mortem forecasts that were high confidence and incorrect.
For example, if an individual or panel states 90% confidence that a system will be breached within the next quarter and is clearly not breached... then this is a topic for discussion and course correction. These misses will damage a panel or individual's Brier score over time.
This method is highly reliant on "hunting", audit logs, and detection. It may warrant a secondary effort to thoroughly review a backlog of activity for a specific type of compromise, thus providing you with confidence that a hard-to-detect event didn't occur.
Whether or not your detection capability or log reliability will be useful in a future incident is also a forecastable scenario that can be improved through this process.
It's perfectly OK to be skeptical that your efforts to detect an issue were thorough. Sometimes we aren't in a place where we can truly rely on our detection capability. In fact, this skepticism is common, and can be tracked by yet another forecast:
We have detected a malicious process running on our bastion server.
Forecast: Yes (55%) / No (45%)
If there is a massive drop off in confidence about your detection capability, you can then pivot and decide if trust needs to be built, or if a different approach altogether is necessary.
In this document, we've measured the outcomes related to an individual scenario. This is considered a single risk analysis.
Eventually, you'll desire having a broader enumeration of scenarios that you'll forecast. This is how risk assessments work at their core, across many scenarios.
Deciding which risks to tackle first can be as simple as a cumulative vote, which also represents group estimation. When enough resources exist to measure or forecast impact, you can then mature to a numeric approach to prioritization.
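A cumulative vote is easy to tally. Here is a minimal sketch where each participant distributes a fixed budget of points across candidate scenarios; the participants, scenarios, and point totals are all invented for illustration:

```python
# A minimal sketch of tallying a cumulative vote. Each participant spreads a
# fixed budget of points across the candidate scenarios; the totals give a
# rough group prioritization. All names and numbers are illustrative.

ballots = {
    "forecaster 1": {"database exfiltration": 5, "laptop ransomware": 3, "insider fraud": 2},
    "forecaster 2": {"database exfiltration": 7, "laptop ransomware": 1, "insider fraud": 2},
    "forecaster 3": {"database exfiltration": 4, "laptop ransomware": 4, "insider fraud": 2},
}

totals = {}
for ballot in ballots.values():
    for scenario, points in ballot.items():
        totals[scenario] = totals.get(scenario, 0) + points

for scenario, points in sorted(totals.items(), key=lambda item: -item[1]):
    print(scenario, points)  # highest-priority scenario first
```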
This is a draft risk management approach for teams who have outgrown checklists and maturity models. There are plenty of known problems, so please use an indoor voice when pointing them out.