Posts Tagged ‘Risk Matrix’
I can’t rule out that I had already viewed this presentation and the words pathways and proximal became lodged in my mind – seeds sown to sprout some distant day in the future. But upon reading this document (again?) I was struck by the apparent similarities with my proposed risk evaluation methodology, which was the subject of much ranting a few weeks ago – here, here, here, here and here.
Specifically, I’m talking about these slides:
Seeing these concepts pop up in a presentation by Professor Reason really made me feel like I am on the right track¹. However, I still have some work to do.
On my to-do list is to figure out how to match the likelihood scale to the new dimensions. Describing likelihood in terms more suited to events doesn’t really hold for the latent type of risk condition. That to-do list is pretty full though, so it’s only a short post today.
1. Yes, this could be just a case of confirmation bias
Since I don’t just want to be thought of as some PIG-hating obsessive lunatic, let’s wrap this thread up for the moment.
Quick recap: The traditional likelihood-consequence matrix (PIG – see original post) is not particularly useful when dealing with aviation safety. Why? Because a graduated consequence scale fails to recognise the perilous nature of aviation and consequence as a dimension isn’t particularly useful when evaluating latent conditions remote from the ultimate outcome (death by aviation).
Alternate approach: Instead of scoring the consequence directly, I’ve offered two alternative dimensions under the generic title of influence¹ – proximity and pathways.
In wrapping this up, I thought I would discuss what I think is the rationale behind this approach of using slightly off-centre indicators.
Obviously, it would be great to have a complete and high-fidelity model of aviation accident causation. Something which showed all the risk conditions, variables, relationships, etc. A model to such a level that the ramifications of the slightest trend could be calculated automatically. Unfortunately, it doesn’t seem to exist or at least, I don’t have one.
The implausibility of such a model is why we have risk management. After all, risk is “the effect of uncertainty on objectives”.
That is why the single score approach contained in most PIGs seems a contradiction in philosophies. To me, it attempts to apply certainty without telling us where the uncertainty has gone. I’m not sure that makes sense but please go with it for a moment.
What I’m trying to say is that using the traditional PIG, I attempt to assign single score X to condition A. Where did the uncertainty go? In short, it is still there and that is the root of a few of the problems I’ve mentioned in my last couple of posts. Especially, the problem of what to score – most likely, worst credible, worst case, etc.
What I’ve attempted to do is retain the uncertainty but keep it out of the scoring process. The proximity and pathways scales are, of course, indirect indicators of something bad happening. There is no guarantee that a risk condition directly connected or with a significant number of connecting pathways to the ultimate outcome will lead to utter catastrophe – but they are variables worth considering.
The uncertainty exists between the scale and the reality. The scoring can be carried out with some degree of confidence according to the scales chosen and the definition of the accident scenario.
Obviously, there may be plenty more such scales. The above two are just the ones that came to mind first – if you can think of any others, I’d love to hear your ideas – please comment.
There is more work to do on this idea, such as working out what other variables are required to support the decision-making process and whether likelihood, probability or frequency is the best indicator for the presence of a risk condition. And so on. But I didn’t want this blog to be all about PIGs or matrices or risk management necessarily.
Next week? My page is blank, I hope I don’t get writer’s block.
1. I might change this label. I really suck at naming things except my kids, their names are awesome ;)
Near the end of my last post, I used the Swiss-cheese model to highlight that many risk conditions¹ worthy of attention are not necessarily proximate to the ultimate outcome. I also hinted in the post before that, that I thought this to be only half the story.
To tell this story, let me introduce another accident causation modelling technique. It is called an AcciMap and it is gaining popularity because it offers a way of representing the relationships between events (these being things such as decisions, functions, tasks, actions, etc.). An AcciMap is set up in two dimensions with vertical lanes separating system levels of increasing generality as you move up and the horizontal axis having no fixed dimension or scale. The system levels begin very specific to the accident in question with equipment and actor activities making up the first two levels. The higher levels relate to organisational, regulatory authority and government policy and decision making.
Here is a poorly drawn adaptation of an AcciMap:
If proximity was the only consideration then the top event and the limited emergency response equipment would be highest risk conditions. They are sitting right next to that big “ouch” so they must be the biggest problem.
But what about those inappropriate budget cuts? A decision like that has wide-reaching effects with most of them hidden until it is too late. I’ve started thinking about risk conditions such as this as having multiple pathways to the ultimate outcome. Therefore, they are just as important as those risk conditions which are in close proximity to the ultimate outcome.
Influencing Outcomes through Proximity & Pathways
So, where am I going with this? I am recommending that instead of a straight consequence dimension, those conducting safety risk evaluation within a complex socio-technical system use an influence dimension made up of two scales – proximity and pathways. These scales can be defined as:
- Proximity – relating to the number of discrete risk conditions between the condition being evaluated and the ultimate condition.
- Pathways – relating to the number of pathways, via which, the risk condition being evaluated may lead to the ultimate condition.
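To make the two scales a little more concrete, here is a minimal sketch of how they might be counted on an AcciMap-style graph. Everything in it is an assumption for illustration: the condition names, the links between them and the function names are all made up, and it assumes the graph has no loops.

```python
from functools import lru_cache

# Hypothetical causal graph: each risk condition maps to the conditions it
# feeds into. Names and links are invented for illustration only.
GRAPH = {
    "budget cuts": ["reduced training", "deferred maintenance", "rescue service underfunded"],
    "reduced training": ["crew error"],
    "deferred maintenance": ["equipment failure"],
    "rescue service underfunded": ["limited emergency equipment"],
    "crew error": ["top event"],
    "equipment failure": ["top event"],
    "top event": ["ultimate outcome"],
    "limited emergency equipment": ["ultimate outcome"],
    "ultimate outcome": [],
}
TARGET = "ultimate outcome"


@lru_cache(maxsize=None)
def pathways(node: str) -> int:
    """Count distinct routes from node to TARGET (assumes no cycles)."""
    if node == TARGET:
        return 1
    return sum(pathways(nxt) for nxt in GRAPH[node])


@lru_cache(maxsize=None)
def steps(node: str) -> float:
    """Fewest links from node to TARGET; infinity if TARGET is unreachable."""
    if node == TARGET:
        return 0
    return 1 + min((steps(nxt) for nxt in GRAPH[node]), default=float("inf"))


def proximity(node: str) -> float:
    """Discrete conditions sitting between node and TARGET on the shortest route."""
    return steps(node) - 1


for condition in GRAPH:
    if condition != TARGET:
        print(f"{condition}: proximity={proximity(condition)}, pathways={pathways(condition)}")
```

In this made-up graph, the top event sits right next to the outcome (proximity 0) but has a single pathway, while the budget cuts sit two conditions back yet reach the outcome by three separate routes. Neither number is a score by itself; they are just the raw counts the two scales would be built on.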
Having multiple scales on one dimension isn’t unusual but the above approach is a little different.
Whereas the typical implementation of a multi-scaled dimension consists of different types of consequences (political, economic, reputation, etc.), the above approach is solely about the safety consequence. Therefore, you can’t really stick these two scales into a common matrix as they sit at a different level to the standard scales.
They also differ in that they relate to the risk condition and not the potential outcome. As the outcome has already been defined as utter catastrophe, the focus has been turned toward the risk condition. And to me, that seems quite intuitive and reasonable.
These differences mean that when combined with some form of frequency or likelihood dimension², we end up scoring the risk inherent to the risk condition. Of course, you can show this in a matrix but I think there is more to this story.
Hopefully, next time, I’ll get this under control and tie it all together…
1. I am loath to just say risks. To me whenever one uses the word “risk” it should be followed by “of” – for example “the risk of a runway excursion is high due to high crosswind, poor surface friction and large jet aircraft”. It is always difficult to discuss a concept without a strong standardised lexicon and the last thing we need right now is another term introduced by some opinionated blogger but… I can’t help it. People refer to a variety of, what I have come to call, conditions when they describe risks – they mention events, hazards, situations, mental states, failures. My intention is to accommodate all these under the one name, risk condition.
2. I’m not sure which to use yet. That problem is for this week’s idle moments
In continuing this little series I’ve got going here, I’d like to just quickly go back over a couple of points from last time. I’m trying to keep these posts relatively short. So that means I may have moved on to my next point a little too quickly.
I guess the crux of the last post was that a graduated consequence scale is inappropriate in an aviation safety context. My two main points to back up that statement were:
- the potential for a catastrophic event is persistent to the primary aviation activity of flying from A to B; and
- that given aviation is a complex socio-technical system, risk conditions (call them hazards, events, or even just risks) upstream of the ultimate condition (death by aviation) cannot be categorised effectively.
I tried a few of these arguments out on some colleagues and they seemed unconvinced. So, I’m going to work on them a bit more here – this blogging thing is much more for my benefit than yours but thanks for stopping by anyway ;).
One step back…
I made two objections to my vulnerability argument – the variety of outcomes flowing from common risks and that the outcome of a risk may vary with the aircraft size/occupancy. My responses to these points were brief. Probably too brief but this is meant to be a blog, not a dissertation. Let’s go over them again.
I don’t want to simply re-state my last post but the concept that catastrophe could have occurred because there exists no inherent limit to the consequence below this, is my best point. But let’s look into it a bit further with an example, a runway overrun.
The vast majority of runway excursions do not end in death but was this because of some recovery measure which set an absolute maximum to the consequence? I don’t think so, in fact, I think it was simply a further reduction of the likelihood of a completely catastrophic outcome – and now we have introduced likelihood into the consequence side of the equation. Is this complexity my own doing? Am I over-thinking this? Probably, but bear with me, please.
We seem to be back to an argument I put up in my first post on this issue. Risk, in an aviation safety sense at least, is not a discrete score – it is a continuum. At the very end of that continuum, always, is the most final of all outcomes. It may have a very small likelihood attached but it is always there – persistent vulnerability.
Now again, I hear you saying (or they might be the voices in my head), but the aircraft occupancy may vary. Yes, you could construct a matrix with the consequence dimension graduating from one death to 500 deaths as required and such a matrix would have its uses. This type of matrix could be used to distinguish between the risk posed by individual operators or sectors of the industry for a variety of purposes such as surveillance planning, high-level regulatory standards development or safety performance measurement.
But it would not be useful within operational safety risk management – by that I mean, when you get into the operational sphere of stuff happening, this type of matrix wouldn’t assist in the decision-making process when one designs and implements safety measures. (I don’t want to just drop this dimension – it is important and it will pop up again later.)
The matrix you have in the above case only tells you about the risk associated with the final outcome. It does not assist in assessing risk conditions upstream.
So what do I mean when I say “upstream”?
Aviation has a plethora of accident causation models. They have their differences, their pluses, their minuses and, of course, their similarities. I think I can say that the one thing all modern accident causation theories agree on is that accidents are never caused by a single act. They are the coming together of many acts with some being quite remote from the accident in terms of both space and time.
For this post, I’m going to run with the ol’ tried & true, Swiss-cheese model¹. It’s not my favourite but it is well-known and serves my purposes here.
What the SCM brought to the world was the awareness that decisions made at the top of an organisation have an impact on frontline safety. When combined with the knock-on and discrete effects from all other levels of the organisation, one could say that, in some circumstances, the frontline operators were doomed from the beginning of their operation.
Examples of these latent conditions include decisions to reduce the maintenance budget, outsource certain functions and even more obscure concepts as failing to inculcate a strong and positive safety culture. How does one risk assess conditions such as these? The link to any tangible consequential outcome is extremely tenuous even with all the accident reports which cite contributory factors such as these.
So now it’s time to think of solutions and last time, I said I thought there were a couple. I’m still working on a couple of these ideas but they will have to wait until next time – I’m already way past my target word count.
More to come…
1. This paper is a critique of the model by a Eurocontrol team which included the inventor, Dr James Reason. It is a good read.
In my last post, I commenced a whinge about the PIG or as it is more commonly known, the likelihood-consequence matrix. I signed off that post with a promise to further the discussion on the risk matrix within an aviation safety context.
Consequence is an inappropriate dimension to consider in aviation safety, for two reasons which I call vulnerability and proximity. Let’s take them in turn.
Aviation is a perilous undertaking. Every time you take to the sky you tend to inject enough potential energy into the situation that no amount of risk mitigation can stand between you and catastrophe¹.
In other fields, a graduated scale of consequence may be appropriate. Finance, for example, can easily delineate between monetary outcomes when limits can be set by how much you put into an uncertain situation. In aviation, you are all in.
Okay, there may be a few readers wishing to interject at this stage. I’m going to take a guess at two counter-arguments to the above position². The first being that aircraft sizes/occupancies vary. The second is that many, many, many occurrences do not result in total annihilation of hundreds of passengers.
Let’s take the second one first. The “but” that I would like to throw in after that sentence is that in every one of those near-misses, minor incidents, major incidents or even non-fatal accidents, catastrophe could have occurred. There was no inherent limit within the situation that meant complete loss of life was not a possibility.
Back to the first point now. Yes, you could limit the amount of life lost by limiting the number of passengers. This method of segregating risk decisions appears throughout the aviation safety sphere – the certification of aerodromes is a good example, the requirement kicks in at aircraft with more than thirty (30) seats. If you were to insert this into a PIG with “death of x number of people” along the consequence dimension, all you would end up with is a 2-D matrix of accident frequency acceptability/unacceptability.
And this leads into proximity…
The “risks” we tend to look at within the aviation safety realm are quite varied. One second we might be considering the chance of an engine failure and its impact during Extended Diversion Time Operations, then we’ll be looking at the impact of a poorly maintained operations manual and following that up with an assessment of an ineffective hazard reporting system. Each of these conditions falls in a completely different area of the accident causation chain.
I’ve started to think about this problem as proximity. How close is this condition to the ultimate outcome? Obviously, conditions closer to the end result are more important and things further upstream are less so, right? I think we start to hit another issue here and it’s one I’m working through at the moment and hope to write about next week.
But before I go, I do want to sum up the above rant.
I believe that the traditional likelihood-consequence matrix is not suited to risk management (assessment/evaluation) within the aviation safety realm. A graduated consequence scale with anything less than complete loss of life fails to recognise the persistent potential for catastrophe and a graduated scale based on loss of life limited by aircraft size cannot be applied to conditions (“risks”) which exist upstream of the final stop of the accident causation chain.
I think there is an answer to these problems. In fact, I think there are a couple. Stay tuned.
1. Until that is, Q unveils his inflatable aircraft cocoon - something like this.
2. If you have any more please feel free to comment.
You can’t swing a euphemism without hitting one when you’re playing in the risk management metaphor. They’re everywhere. Whenever you start looking at anything risk management related, you are sure to find a PIG.
PIG stands for Probability-Impact Graph – otherwise known as likelihood-consequence matrix or frequency-severity chart or some combination of these words. I’m most familiar with the LxC matrix label, so I’ll use it from here on in.
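For anyone who hasn’t met one, here is a rough sketch of how a typical LxC matrix turns two judgements into one number. The scale labels, the multiply-the-ranks rule and the colour thresholds are all assumptions picked for illustration; real matrices vary and this is not taken from any particular standard.

```python
# Hypothetical 5x5 likelihood-consequence matrix. Labels and thresholds are
# illustrative assumptions, not taken from any standard or regulator.
LIKELIHOOD = ["rare", "unlikely", "possible", "likely", "almost certain"]
CONSEQUENCE = ["negligible", "minor", "moderate", "major", "catastrophic"]


def pig_score(likelihood: str, consequence: str) -> int:
    """Return the product of the 1-based ranks on each scale."""
    l_rank = LIKELIHOOD.index(likelihood) + 1
    c_rank = CONSEQUENCE.index(consequence) + 1
    return l_rank * c_rank


def pig_band(score: int) -> str:
    """Map a score to a colour band (the cut-offs are assumed)."""
    if score >= 15:
        return "red"
    if score >= 6:
        return "amber"
    return "green"


score = pig_score("unlikely", "major")
print(score, pig_band(score))
```

The point to notice is that one likelihood label and one consequence label go in, and a single score comes out.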
Over the past year or so, I’ve been growing more and more uneasy with the application of this tool within the aviation safety environment. I wasn’t seeing, however, the same discontent in others and therefore, started to doubt my own reservations. Luckily, I found some like-minded people over at LinkedIn (membership of both LinkedIn and the group is required to view the actual discussion) with a Mr Stephen Cresswell putting his thoughts on paper here.
My new best friends have identified a range of issues with the PIG, some of which apply to other applications and some of which are very similar to my concerns.
So what are my concerns?
The first one is to what do I apply the score – do I apply it to the hazard, the event or the outcome? For me, the outcome always seemed wrong because the consequence is contained within its definition, which negates the need for that dimension of the score. The event gives you good opportunity to attach a likelihood of it occurring but what about an event with a variety of possible consequences or causes (hazards)? And for hazards, is it the likelihood of existence or of some consequential event? And here we go, wrapping ourselves up in knots.
Example time: Let’s have some evil birds hitting lovely, peaceful planes¹. On an airport, birds tend to cause a bit of stress in their operator’s lives. How does one risk assess this problem?
Do you calculate the likelihood & consequence of the bird-strike event? Seems simple enough but how do you account for different birds in different areas affecting different phases of the aircraft’s flight? Do you then apply the calculation to each bird species? How do you distribute this score across the possible outcomes?
And that brings me to my second beef with PIGs – risk is not a discrete score.
If risk is indeed a combination of likelihood and consequence, in the aviation safety context, I don’t see how it can be expressed as a discrete score. The risk of a bird-strike is a continuum. Most of the time, i.e. high likelihood, the consequence will be minor or negligible (near-miss or small collision). Some of the time, i.e. lower likelihood, the consequence will be major (something broken) and on rare occasions, i.e. really low likelihood, you’ll get a major media event.
So what do you score? The most likely consequence, the worst-case scenario, the most credible outcome, etc. etc. etc.?
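A quick sketch shows how much the answer depends on that choice. The probabilities, the severity ranks and the “credible” cut-off below are all invented for illustration; they are not real bird-strike data.

```python
# Assumed outcome distribution for a single bird-strike event.
# Probabilities and severity ranks are invented for illustration only.
outcomes = [
    # (description, probability given a strike, severity rank 1-5)
    ("near-miss or small collision", 0.90, 1),
    ("something broken", 0.099, 3),
    ("major media event", 0.001, 5),
]

# Assumed cut-off below which an outcome is not treated as "credible".
CREDIBLE = 0.01

most_likely = max(outcomes, key=lambda o: o[1])
worst_case = max(outcomes, key=lambda o: o[2])
worst_credible = max(
    (o for o in outcomes if o[1] >= CREDIBLE),
    key=lambda o: o[2],
)

for label, pick in [
    ("most likely", most_likely),
    ("worst credible", worst_credible),
    ("worst case", worst_case),
]:
    print(f"{label}: severity {pick[2]} ({pick[0]})")
```

Same event, same made-up numbers, and three different severity ranks depending on which convention you pick. Whichever one ends up in the matrix, the rest of the continuum has been thrown away.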
For my last point, I’ll steal directly from Mr Cresswell:
PIGs take a simplistic view of risk in which there is no system-based thinking about the relationships and dependencies between risks, opportunities and uncertainties.
Aviation is an extremely complex socio-technical system – it’s the relationships that matter. Treating each “risk” as a separate line item with its own discrete score doesn’t mesh with our thinking in other areas – especially, accident causation theory and the overall safety management system concept.
I’m going to try to develop these ideas over the coming weeks (with more regularity than to date) – stay tuned.
1. Last year I posted this on bird-strike risk management. I even used a PIG approach at the more strategic level but dropped it for the species-specific risk assessment; instead, I opted for a completely different approach.