This appraisal has been produced so practitioners and decision makers can better understand and work with the approaches available. Readers should note that:
- it does not mean that existing risk methods and frameworks cannot or should not be used, or that they are fundamentally ineffective
- irrespective of what method or framework is used, if it’s not supported by thought or context, then effective risk management outcomes are unlikely to be realised
General limitations of risk methods and frameworks
This section describes the fundamental limitations that affect all risk methods and frameworks included in this research.
Limits of a ‘reductionist’ approach
Whilst the taxonomies used by various risk assessment methods and risk management frameworks can differ, their features are fundamentally the same. Current approaches to risk assessment and risk management often reduce risks to their constituent components for analysis (such as threat, vulnerability and impact).
This can lead practitioners and decision makers to fixate on the individual risk components to the detriment of understanding the ‘risk writ large’.
This reductionist approach means that the underlying complexity and context – which is necessary to consider when undertaking analysis – can be overlooked. The effect of a risk being realised is caused by some combination of all its components, not by a single element, and a reductionist approach can fail to acknowledge that it is the interactions of these components that realise the risk.
Lack of variety
Most organisations choose a limited approach to mitigating the risks associated with technology (typically a control set as recommended by the risk assessment method used) and stick with it. This limited approach will not have the necessary variety to effectively control all of the different dynamics inherent in modern technology systems. The nature of these technology systems, and the way they are delivered, means that the associated dynamics are far more variable than the means by which they are controlled. Organisations will need to adopt an approach to risk mitigation which has equivalent variety to that of the dynamics associated with the technology systems they use.
Limits of a ‘fixed state’ approach
A technology system that is ‘secure’ today will not necessarily be secure tomorrow. The changeable and complex nature of technology systems means that the corresponding security cannot be in a fixed state. Current approaches to risk assessment and management imply a fixed state of security. Interpreting security in this way can lead to poor decision making and ongoing risk management.
Security should be viewed as a dynamic property of a technology system which should adapt to risk. Organisations need to have a capability in place to observe, detect and respond to risk as it emerges, so that security can react accordingly. Detection and response activities should continue to monitor the state of security in order to maintain a risk management balance across technology systems. This does not necessarily mean the adoption of expensive monitoring solutions; rather that organisations need to establish the technical and non-technical means by which they can detect and respond to risk signals in order to realise a management balance.
Lack of feedback and control
In order to maintain control of security within a technology system, mitigation activities need to include feedback. Security controls need to adapt in response to changes in, for example, threat, technology and business use.
Most existing approaches to mitigation specify the application of a fixed control set which does not consider ‘real world’ security feedback. This feedback is essential for the effective regulation of technology systems. That is, security feedback can inform the amplification of mitigation activities in situations where increased assurance is required, and the dampening of mitigation activities in situations where they are becoming excessive.
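The sketch below (in Python, with purely illustrative names, scales and thresholds rather than anything drawn from a specific method) shows the kind of feedback loop being described: mitigation activity is amplified when the observed risk signal indicates that increased assurance is required, and dampened when it indicates the controls are becoming excessive.

```python
# Purely illustrative sketch of feedback-driven mitigation; the names,
# scales and thresholds are assumptions, not taken from any method.

def adjust_mitigation(current_level: int, observed_signal: float,
                      amplify_above: float = 0.7, dampen_below: float = 0.2) -> int:
    """Return a new mitigation level (0-10) based on an observed risk signal (0-1)."""
    if observed_signal > amplify_above:
        return min(current_level + 1, 10)   # amplify: increased assurance required
    if observed_signal < dampen_below:
        return max(current_level - 1, 0)    # dampen: controls becoming excessive
    return current_level                    # within tolerance: no change

# Example: a periodic review feeds observed risk signals back into control intensity.
level = 5
for signal in [0.1, 0.3, 0.8, 0.9, 0.4, 0.1]:
    level = adjust_mitigation(level, signal)
    print(f"signal={signal:.1f} -> mitigation level {level}")
```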
Losing risk signals in the ‘security noise’
When using any risk method or framework, real risk signals can be lost in the ‘security noise’ caused by normal system operation, information opacity, misguided analysis or bias. The problem of security noise can be compounded in situations where corresponding metrics are produced, as this implies validation. It is commonplace for these metrics to then be used to support decision making.
Practitioners and decision makers should therefore be aware that security noise can manifest itself as a false positive risk signal, and should develop the means of identifying, assessing and validating risk signals as part of their assessment and analysis activities.
System operation
When a system is operating under normal working conditions, security noise will be generated. For example, it is normal for systems connected to the Internet to be scanned, and it is normal for users to make mistakes when entering their passwords. These normal security occurrences can create ‘noise’ that can make it difficult to see the occurrence of real risks or abnormal activity, such as the organisation being specifically targeted by attackers. If security noise is not considered in this way and filtered out, it can result in the real risk signals being missed.
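As a purely illustrative sketch of this kind of filtering, the example below separates events that form part of an assumed ‘expected background’ from events that may be real risk signals warranting validation; the event categories and baseline are assumptions made for the example, not a recommended rule set.

```python
# Illustrative only: separate routine security noise from candidate risk signals.
# The event types and the expected background are assumptions for this example.

EXPECTED_BACKGROUND = {"internet_scan", "failed_password", "spam_blocked"}

events = [
    {"type": "internet_scan", "count": 1200},
    {"type": "failed_password", "count": 340},
    {"type": "admin_login_unusual_hours", "count": 3},
    {"type": "large_data_export", "count": 1},
]

# Events outside the expected background are candidate risk signals that warrant
# validation, rather than being filtered out with the noise.
for event in (e for e in events if e["type"] not in EXPECTED_BACKGROUND):
    print(f"investigate: {event['type']} (x{event['count']})")
```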
Information opacity
The term ‘information opacity’ is used to describe a situation where risk signals are filtered or modified as they travel through the structure of an organisation, so that understanding of the risk is lost. Practitioners and decision makers can become detached from the risk signals present because of the policies and processes in place, the ‘depth’ of the workforce, or the complicated nature of working practices. This can make it difficult to appreciate the actual risk signals that are present.
For example, security events can provide a useful source of risk signals in support of assessment, but unless they are experienced first-hand by the practitioners and decision makers (or are documented at source with a high degree of accuracy and accessibility), the understanding can be lost in the surrounding security noise.
Noise from misguided analysis
Security noise can also be generated by misguided analysis based on poor scoping and modelling. The risk scope or model is often not an accurate reflection of the real world; for example it may be static in nature, it may not account for change, or may be based on incorrect assumptions. In the worst cases, the scope or model may only contain security noise, rather than the risk signals that could affect the security of an organisation’s technology systems.
Consider a scenario where practitioners and decision makers decide to produce a scope or model of a technology system in support of compliance activities. As a result, their view of risk is limited to that scope or model, overlooking the risk signals that are potentially present in the technology systems that the scope or model is reliant on.
Noise from bias
Security noise can also be generated through bias. For example, the latest vulnerability headline in the news can bias practitioners and decision makers so that operational teams are encouraged to focus on this security noise, causing them to overlook real risk signals associated with applicable vulnerabilities.
Assumed determinability
Irrespective of the risk method or framework used, practitioners and decision makers can believe that it is possible to predict, with a level of certainty, the causes and effects of all risks. As a result, probability can be overlooked or poorly considered.
The outputs from risk management methods and frameworks can be considered to be predetermined, because presumptions are made about the causes of a security risk with threats taking specified forms, vulnerabilities being of a certain type, and impact being business-related.
It is implied that the same inputs will always lead to the same outputs (determinism); however this is not necessarily the case, because we are dealing with the interaction of people and technology. There will always be some level of uncertainty about the outputs from any risk assessment and analysis technique.
Where probability is being considered, it needs to be validated, for example, with hard data and expert opinion. Past events are not always a good predictor of future events, and a statement of probability can bias practitioners and decision makers, leading them to place unfounded confidence in its predictive abilities.
For example, a statement of risk probability (say, a 1-in-100 year event), because of its implied surety, can influence an approach to mitigation which focusses on mitigating an ‘unlikely’ event to the detriment of complementary activities (such as prevention, response and recovery). Probabilities should not be received as statements of surety; they are a means of reducing uncertainty in support of risk management decision making, rather than eradicating it.
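The short calculation below illustrates why a ‘1-in-100 year’ label should not be read as surety: even under the simplistic assumption that years are independent, the chance of at least one such event occurring over a multi-year planning horizon is far from negligible.

```python
# Illustrative only: a '1-in-100 year' event over a planning horizon,
# assuming (simplistically) that years are independent.

annual_probability = 0.01  # a 1-in-100 year event

for horizon_years in (1, 10, 25, 50):
    p_at_least_one = 1 - (1 - annual_probability) ** horizon_years
    print(f"chance of at least one event in {horizon_years} years: {p_at_least_one:.1%}")
# Roughly a 1 in 10 chance over 10 years, and nearly 4 in 10 over 50 years.
```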
The complex nature of the technology systems used today means that risks will emerge that were not previously anticipated through assessment and analysis techniques. Failing to recognise this uncertainty (and the non-deterministic nature of risk) can lead to complacency amongst practitioners and decision makers, preventing them from preparing for emergent or changeable risks.
Specific limitations of risk methods and frameworks
This section describes the more specific limitations that affect all risk methods and frameworks included in this research.
Abstraction through labelling
Risk methods and frameworks use labels to name, number, order and measure the components of risk. This act of labelling introduces a layer of abstraction which can conceal the subtlety and complexity of risk. As a result, the necessary context for meaning is lost, and it becomes more difficult to arrive at a consistent understanding. In addition, the type of labels used in risk management can promote bias in practitioners and decision makers.
For example, a qualitative approach where a threat is labelled as ‘high’ conceals both the nature and capability of that threat. This act of labelling will be interpreted in different ways by practitioners and decision makers. Equally, a quantitative approach which measures and labels financial loss as £1,000,000 will conceal the impact of that loss. Again, this act of labelling will be interpreted in different ways by practitioners and decision makers.
Numerical labels are often perceived as being more reliable because they give the appearance of rigour. Whilst this is sometimes the case, numerical labels can promote bias because they are received with more confidence. For example, a risk which is labelled as 60% probable will instil a greater sense of surety in practitioners and decision makers than if it were labelled ‘medium-high’.
The limits of using matrices
Many risk methods and frameworks use matrices to combine input risk components to produce risk values or statements. Prior work has shown that, though matrices are convenient and quick to use, relying on them to produce outputs that inform management decisions can belie the complexity of technology systems, and therefore the true nature of the risks associated with them. Matrices can hide the function used to combine input components, which makes it difficult to determine the validity of the risk output. Additionally, the function used can vary, resulting in inconsistent risk outputs. For example, two different matrices could combine risk components in different ways, yet produce identical-looking risk outputs both labelled ‘high’.
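The hypothetical sketch below illustrates the point: two matrices combine the same likelihood and impact inputs using different hidden functions, yet can emit identical-looking ‘high’ outputs. The scales, functions and thresholds are assumptions made purely for illustration.

```python
# Illustrative only: two matrices with different hidden combination functions
# can produce identical-looking output labels. Scales and thresholds are assumed.

LABELS = ["low", "medium", "high"]

def matrix_a(likelihood: int, impact: int) -> str:
    """Combines 1-5 inputs multiplicatively."""
    score = likelihood * impact              # hidden function A
    return LABELS[0] if score <= 6 else LABELS[1] if score <= 14 else LABELS[2]

def matrix_b(likelihood: int, impact: int) -> str:
    """Combines 1-5 inputs additively, weighting impact more heavily."""
    score = likelihood + 2 * impact          # hidden function B
    return LABELS[0] if score <= 6 else LABELS[1] if score <= 10 else LABELS[2]

print(matrix_a(4, 4), matrix_b(4, 4))  # -> high high: identical-looking outputs
print(matrix_a(2, 5), matrix_b(2, 5))  # -> medium high: the hidden functions diverge
```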
Furthermore, the use of linear scales for risk inputs and outputs in matrices can obscure the non-linear nature of risks that are affected by determining factors such as time. A ‘high’ risk, for example, may not be more severe than a ‘medium-high’ risk by the same margin that a ‘medium-high’ risk is more severe than a ‘medium’ risk.
Limits in the way uncertainty is presented
Risk values or statements produced by methods and frameworks do not effectively communicate the uncertainty that is associated with them. As a result, these risk values or statements are visualised by practitioners and decision makers as being a certainty, rather than a probability. Whilst some methods and frameworks include values or statements of likelihood, the way this is typically included as part of the risk output masks the uncertainty. In addition these values or statements are predetermined input components, rather than actual probabilities.
Practitioners and decision makers should be aware that the risk outputs from methods and frameworks are not values or statements of probability. Furthermore such risk outputs can compound an assumption of determinability, making it difficult for practitioners to communicate (and for decision makers to visualise) uncertainty.
The effect risk relationships have on impact
The complex nature of technology systems means that there will be relationships between the risks. Amongst the large amount of information generated by risk methods and frameworks, these relationships can easily be concealed from practitioners and decision makers. These relationships can result in impacts that are quite different from those estimated when risks are considered in isolation.
For example, consider an organisation that has identified the following three risks in a risk assessment:
- a distributed denial of service (DDoS) attack against its web services, impacting sales orders
- a failure of payment transaction integrity, impacting profit
- unpatched desktop applications, impacting the integrity of the local working environment
As is usual, the practitioner presented the decision maker with a large prioritised list of technical and non-technical risks, which included these three. As a result, these risks were considered discretely, with separate projected financial and non-financial impacts.
However, let’s now consider a scenario in which a staged DDoS attack is launched against the organisation’s web services, preventing customers from placing orders. At the same time, a document about DDoS protection (containing malware intended to affect the integrity of payment transactions) is emailed by the attackers to members of the finance team, who subsequently open it using a vulnerable application.
As a result of this single sophisticated attack, three risks that were considered to be unrelated have now been realised at the same time. The compounded impact on the organisation is greater than the impact of each risk considered in isolation. These relationships between risks, and the resultant compounded impact, were not reflected in the output of the assessment because of the way in which it was presented to the decision maker.
Practitioners and decision makers should therefore seek to identify any relationships that exist between risks and understand how these affect estimations of impact, or other projections they have made.
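A purely illustrative sketch of this compounding effect is shown below; the impact figures and the amplification factor are assumptions made for the example, not estimates from any real assessment.

```python
# Illustrative only: impacts assessed for risks in isolation versus the compounded
# impact when related risks are realised together. All figures are assumed.

isolated_impacts = {
    "ddos_against_web_services": 50_000,    # lost sales during the outage
    "payment_integrity_failure": 80_000,    # fraudulent transactions
    "unpatched_desktop_exploited": 20_000,  # clean-up and lost productivity
}

sum_of_isolated = sum(isolated_impacts.values())

# When realised together in one staged attack, interactions (delayed detection,
# prolonged outage, reputational harm) can amplify the impact.
assumed_amplification = 1.6
compounded_impact = sum_of_isolated * assumed_amplification

print(f"sum of isolated impacts: £{sum_of_isolated:,.0f}")
print(f"compounded impact when risks are realised together: £{compounded_impact:,.0f}")
```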
The adverse effect of intervention
Taking action to manage risks can lead to the emergence of new risks and impacts. The adverse effect of intervention is rarely catered for in risk methods and frameworks, or considered in practice by practitioners and decision makers.
For example, consider an organisation that has identified the risk posed by a malware outbreak to its computer systems. As in the previous example, the practitioner presented the decision maker with a large prioritised list of technical risks, non-technical risks, and the corresponding recommended mitigations, without considering the effect they would have on other things that the organisation cares about. As a result, the organisation decided it would isolate affected systems in the event of a malware outbreak, in order to contain it.
Whilst this initially seems like an appropriate course of action, this intervention could have adverse effects that were not previously considered. For example, because the organisation used the same infrastructure for both voice and data communications, this intervention (i.e. isolating the computer systems), would have an adverse effect on users who would no longer be able to make or receive phone calls. Intervention to manage risks can have adverse effects, resulting in new risks and impacts that were not previously considered.
Impacts are not limited to the scope of assessment
When conducting a risk assessment it is normal to define a scope. The assessment and analysis that follows will, understandably, only consider the impacts on assets that are within this scope. However, the increasingly complex and interconnected nature of technology systems means that the impact of a risk being realised can extend far beyond that which is scoped. As a result the impact of the risk can be far greater than that which was originally assessed and analysed, and affect other people, organisations, business goals or priorities.
This has been illustrated by recent high profile cyber-attacks, for example the attacks on TalkTalk, Sony and ALM Inc. Whilst the things traditionally considered to be in scope for a risk assessment were impacted (i.e. technology systems, business information and business reputation), subsequent investigations and media reports have shown that the actual impacts have extended beyond those things to affect the personal lives of customers and employees, causing them significant harm, loss and embarrassment.
Practitioners and decision makers need to appreciate that there are limitations (and thus uncertainty) when assessing and analysing impact, and that the true impact of a risk being realised can extend far beyond the scope of assessment.
The effect of time on risk
Many risk methods and frameworks require estimations (such as impact, or the capability of a threat) to be made at the outset. However, this approach does not consider the effect time has on the components of risk, and thus on the continued effectiveness of the management approach.
Consider an organisation that suffers financial losses as the result of a technical outage. When the risk assessment was conducted, it predicted that the impact of such an outage would be recoverable, so mitigations were established accordingly. However, as time passes and the outage persists, financial losses begin to accumulate faster than the rate that was originally predicted. As a result, the eventual financial impact of the outage means the business can no longer recover, regardless of the mitigations employed.
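The sketch below illustrates this effect with assumed figures: losses predicted to accumulate linearly instead accelerate as the outage persists, eventually crossing the (assumed) threshold beyond which recovery is not possible.

```python
# Illustrative only: predicted (linear) versus actual (accelerating) losses as an
# outage persists. The figures and the growth model are assumptions.

PREDICTED_LOSS_PER_DAY = 10_000
RECOVERABLE_THRESHOLD = 200_000  # assumed point beyond which the business cannot recover

def actual_loss(days: int) -> int:
    # Losses accelerate over time, for example as customers defect and penalties accrue.
    return 10_000 * days + 1_500 * days ** 2

for days in (1, 5, 10, 15):
    predicted = PREDICTED_LOSS_PER_DAY * days
    actual = actual_loss(days)
    status = "recoverable" if actual <= RECOVERABLE_THRESHOLD else "NOT recoverable"
    print(f"day {days:>2}: predicted £{predicted:>7,} actual £{actual:>7,} ({status})")
```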
Or consider an organisation that wants to protect itself from an attacker who has the capability to detect (and then exploit) vulnerabilities in the organisation’s websites following their public disclosure. Following an analysis by the practitioner, organisational policy was set so that websites would be updated within two weeks of a patch being made available by the vendor. The practitioner believed at the time that attackers would not have the capability to detect and exploit vulnerabilities within this time frame.
However, the practitioner had not anticipated the speed at which exploit kits are updated with the ability to detect and exploit the latest publicly disclosed vulnerabilities. This, combined with the fact that exploit kits are freely and easily available, resulted in the threat being more capable than was originally assessed, meaning the organisation’s websites were no longer adequately protected.
Practitioners and decision makers should therefore ensure that the effect of time is taken into account and that risk understanding is current, so that management activities continue to be suitable.
Source: NCSC