Causal analysis is a surprisingly complex process that over the years has been subject to pushes and pulls from a wide variety of professional influences. When determining the actual cause of an accident or an incident, any number of stakeholders would like to address the issue that “caused” the accident, whether to prevent a recurrence or, at the other end of the remedial spectrum, to punish the causal party.
In theory, there must be a cause for everything, but in practice finding the cause often involves a high level of judgment, and judgment is complicated and often deeply flawed. If judgment resided in a single professional community, that community might converge on a consistent way of determining the cause of an incident. And once the cause is established, we can give that cause a name. It is apparently too simple to just call it “the cause” of an incident, and that makes sense, since all too often there are multiple causes of an incident. And because there can be multiple causes, different terms for causes have started to emerge.
Cautious cause analysts may simply want to identify issues (causal issues) that appear to be related to the cause of an incident. Those analysts may identify “causal factors”: anything that, when interrupted, may break the failure chain. Other analysts may prefer the term “proximate cause”. The term almost speaks for itself in that it denotes a cause that is close to the incident, but it is difficult to define precisely.
Nevertheless, the legal profession has made an effort to define proximate cause and more or less defines it as “an event sufficiently related to a legally recognizable injury to be held to be the cause of that injury.” The term “legally recognizable injury” already indicates that this cause is related to blame, but the legal profession freely admits that the term proximate cause is notoriously confusing.
There have also been attempts to use the term “root cause”, which is defined as an initiating cause of either a condition or a causal chain that leads to an outcome or effect of interest. Root causes are related to root cause analysis, where the analyst keeps stepping back in time until the failure path shows the event that initiated the failure.
The problem with root cause analysis is that a failure can have many branches, and depending on which branch you follow, you end up at a different root cause. Moreover, for each branch one can ask where the root cause actually lies. Taken to an extreme, it can be argued that lust is the universal root cause: without lust there are no humans, and if a failure is not an “Act of God” (an actual term applied in failure analysis, especially in maritime practice), then humans caused it, and lust, which produces humans, is therefore the cause of all human failure. This is just an example, but it shows how easy it is to pick any root cause for any failure. One can argue against any root cause by picking another, nearly random, root cause, which in turn can be replaced by yet another, yielding an arbitrary number of root causes and a basically pointless analysis. Even worse, many failure analysts identify root causes that are not actually causal. In other words, their root causes might appear to have something to do with the incident, but were not involved in the actual cause of the incident.
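The branching problem can be made concrete with a small sketch. The causal tree below is entirely hypothetical (all event names are invented for illustration); walking every branch back in time from the incident yields a different “root cause” per branch, which is exactly the ambiguity described above.

```python
# Hypothetical causal tree for an illustrative incident.
# Keys are events; values are the contributing events behind them.
# All names are invented for illustration only.
causes = {
    "pipe rupture": ["corrosion", "overpressure"],
    "corrosion": ["coating not renewed"],
    "overpressure": ["relief valve stuck"],
    "coating not renewed": [],
    "relief valve stuck": [],
}

def root_causes(event, tree):
    """Step back in time along every branch; each leaf is a candidate 'root cause'."""
    branches = tree.get(event, [])
    if not branches:
        return [event]  # no further history: this branch ends here
    roots = []
    for branch in branches:
        roots.extend(root_causes(branch, tree))
    return roots

print(root_causes("pipe rupture", causes))
# → ['coating not renewed', 'relief valve stuck']
```

Even this toy incident produces two equally defensible root causes, and adding history behind either leaf would simply push its “root” further back.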
So as a failure engineer, is cause analysis pointless? Strangely, from an engineering point of view, cause analysis is much more straightforward. (Although it may still be very complex, and that is not a contradiction.)
Engineers are not tasked with playing the blame game. Instead, they are much more strongly involved in the prevention game. In prevention, an engineer can juggle hundreds of possible causes, but will evaluate each one in terms of efficiency. As such, if there are many contributing causes or causal factors (an even more neutral term), an engineer engages in a nuanced analysis, tries to ferret out the most efficient solutions that prevent the accident, and can then refocus on those as useful causes of the incident. Sometimes (but not very often) that can mean some type of signage was necessary, sometimes it can mean a physical redesign is required, and sometimes it involves training.
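One way to picture this efficiency-driven triage is the sketch below. The factors, costs, and risk-reduction scores are all invented placeholders, and “risk reduction per unit cost” is just one plausible notion of efficiency, not a standard metric from the text.

```python
# Hypothetical contributing factors with invented cost and effectiveness scores.
factors = [
    {"factor": "missing signage", "cost": 1, "risk_reduction": 0.2},
    {"factor": "physical redesign", "cost": 50, "risk_reduction": 0.9},
    {"factor": "operator training", "cost": 10, "risk_reduction": 0.5},
]

# Rank prevention options by risk reduction per unit cost (one simple
# notion of "efficiency"); the engineer refocuses on the top-ranked ones.
ranked = sorted(
    factors,
    key=lambda f: f["risk_reduction"] / f["cost"],
    reverse=True,
)

for f in ranked:
    print(f["factor"])
```

With these made-up numbers, cheap signage outranks an expensive redesign even though the redesign removes more risk, which is the kind of trade-off the nuanced analysis is meant to surface.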
Once the more efficient prevention approaches have been identified, one may ask: Did these prevention approaches already exist? And did they fail? If the prevention approach did not exist before the failure, one can actually say it was an accident and no blame should be assigned, no matter how much the public screams for it. If prior prevention approaches existed but failed, one can call it an accident waiting to happen, and it pays to dig a little deeper to see how the prevention approach failed; some level of blame may exist.
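The two questions above form a small decision procedure, sketched here; the labels are illustrative shorthand for the distinctions drawn in the text, not established terminology.

```python
def classify(prevention_existed, prevention_failed):
    """Classify an incident by the two questions: did a prevention
    approach exist, and did it fail? Labels are illustrative."""
    if not prevention_existed:
        # No prior safeguard: a genuine accident, no blame assigned.
        return "accident"
    if prevention_failed:
        # A safeguard existed but failed: dig deeper; some blame may exist.
        return "accident waiting to happen"
    # Safeguard existed and held: no incident to classify.
    return "prevented"

print(classify(prevention_existed=False, prevention_failed=False))  # → accident
print(classify(prevention_existed=True, prevention_failed=True))    # → accident waiting to happen
```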
Inherent in all of this are two important observations:
Accidents can happen, but make sure you are not part of an accident waiting to happen. That means that if you see, or do, something that could reasonably result in harm to others, pay attention and make a reasonable effort to reduce that harm.
As an engineer, stay away from the terms “root cause” and “proximate cause”; those terms are laden with poorly defined baggage. Instead, look for causal factors, engage in prevention analysis, and provide a nuanced discussion of how to prevent the accident rather than of how to assign blame.