“You’re a hero one day, you’re a villain another day.” — Vincent Tan
Every good story is a story of conflict. It has a villain. It has a victim. And in the best stories, it has a hero. The story of a process incident or scenario is no different.
It has villains—causes. It has victims—receptors, such as personnel, the community, and the environment. It has heroes—the safeguards and independent layers of protection that defend against the villain.
In some stories of process incidents and scenarios, these roles are played by human action.
The Roles of Human Action
There are three distinct roles played by human action in process incidents: “villains”, “victims”, and “heroes”. The role of villain is the part error plays as an initiating cause. The role of victim goes to people, either within the fence line or beyond the fence line, who suffer the impact of the incident. And the role of hero is the part of independent layers of protection that depend on human action, which can be either administrative controls or operator response to an abnormal condition.
Villains
Errors can and will occur at any point in the safety lifecycle where human activity is required. To consider errors, however, it is important to understand the nature of errors. Trevor Kletz characterized errors as one of three types: lapses, mistakes, and violations.
Lapses. Lapses occur despite an operator’s best intentions. They know what they should do. They are able to do what they should do. They want to do what they should do. And yet, they do not do what they should. Why? Many reasons are frequently cited; inattention, distraction, and competing priorities are just a few. What they have in common is that the operator is highly trained and skilled, and is relying on that training and skill. An external factor then derails the operator from their normal train of thought. Often, the operator will get back on track, without incident. Sometimes they will not.
It is not wrong for operators to rely on their training and skill instead of relying on the heightened state of awareness that it takes to avoid inattention and distraction. It is exhausting to remain in a continual state of high alert. Any risk reduction strategy that calls for a permanent state of high alert cannot be sustained.
Lapses occur, not because operators are poorly trained, but because they are well trained. More training is not the answer. Neither is a reprimand or other disciplinary action. Lapses occur despite wanting to do what should be done. While it could be argued that disciplinary action can be justified in terms of increasing the desire to do what should be done, it will not address the failures that occur despite the purest of desires. The only response to lapses is to change the system so that a lapse has less impact, or to accept the lapses.
Mistakes. Mistakes also occur despite an operator’s best intentions. They want to do what they should do, but they do not know what they should do or, if they do know, they are unable to do it. In this case, disciplinary action is still not the answer. Should they be punished for not knowing what to do or being unable to do it? The responsibility for that falls elsewhere. Training and practice, however, will resolve the issues of not knowing what to do or not being able to do it.
Violations. Violations occur when an operator decides not to follow the designed work process. This may be because they are evil—lazy, malicious, greedy, or any of the other human character flaws. It is possible. It is also possible that an operator decides not to follow the designed work process with the best of intentions. Before determining how to respond to a violation, it is important to understand why the violation occurs.
If an incident investigation reveals that the intentions of a violation were good or neutral, then the best response is to modify the training or the work process design to bring intentions and work process into alignment. If the investigation reveals that intentions were evil, the work process should be revised to be indifferent to character flaws, or the operator committing the violation should be removed from the work process—neither work processes nor training can fix character flaws.
Victims
The role of victim goes to the receptor in an incident or scenario. Becoming a receptor requires no error on the part of the victim, so a discussion of the role of victim is not a discussion of error. Instead, it is a discussion of occupancy, of being there.
Victims can include both personnel and members of the community. While the environment and assets are always present, the presence of potential victims can be highly variable. The probability of their presence (or absence) can have a significant influence on risk.
The Buncefield oil storage depot explosion and fire in 2005 has been described as “the largest seen in peacetime UK”. Yet there were no fatalities, because the explosion occurred shortly after 6 am on a Sunday morning. While many are quick to point out that the death toll would have been much greater had the event occurred at rush hour on a weekday, the incident also illustrates how the absence of receptors means the absence of impact, at least in terms of victims.
Most organizations treat the community as an uncontrollable receptor that is always present. The presence of plant personnel, on the other hand, is something that can be and is controlled for many reasons, including risk reduction. An isolated facility that is unoccupied and operated remotely may be subject to asset risk and environmental risk, but it is not subject to safety risk for personnel. In fact, that might be the very reason that it is isolated, unoccupied, and remotely operated.
Reduced occupancy reduces risk, whether in total or in part. Some organizations are unwilling to credit limited occupancy. The argument frequently comes down to fear that the normal response to an abnormal condition, sending maintenance out to check on it, will ensure that the area is occupied when the hazardous event occurs. When the initiating event is the failure of a piece of equipment with lots of warning, so that maintenance personnel are there trying to effect a repair, then the occupancy factor for that event will be 100%. When occupancy is limited and the nature of the initiating cause is such that occupancy is not influenced by the event itself, then occupancy may be much, much less.
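To make the arithmetic concrete, here is a minimal sketch of how an occupancy factor might scale the frequency of an event that harms personnel. The function names, the simple hours-per-week model, and the numbers are my own illustrative assumptions, not values taken from any standard or from a real facility.

```python
HOURS_PER_WEEK = 168.0


def occupancy_factor(hours_occupied_per_week: float,
                     event_draws_personnel: bool = False) -> float:
    """Estimate the probability that someone is present when the event occurs.

    Assumption: occupancy is modeled as the fraction of the week the area
    is occupied. If the initiating event itself brings people to the area
    (for example, maintenance responding to failing equipment), the factor
    is taken as 1.0, i.e. 100%.
    """
    if event_draws_personnel:
        return 1.0
    if not 0.0 <= hours_occupied_per_week <= HOURS_PER_WEEK:
        raise ValueError("hours_occupied_per_week must be between 0 and 168")
    return hours_occupied_per_week / HOURS_PER_WEEK


def personnel_impact_frequency(event_frequency_per_year: float,
                               occupancy: float) -> float:
    """Frequency of the hazardous event occurring while the area is occupied."""
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy must be between 0 and 1")
    return event_frequency_per_year * occupancy


# Hypothetical scenario: a hazardous event expected once in ten years,
# in an area that is normally visited for 8.4 hours per week.
independent = occupancy_factor(8.4)
drawn_in = occupancy_factor(8.4, event_draws_personnel=True)

print(personnel_impact_frequency(0.1, independent))
print(personnel_impact_frequency(0.1, drawn_in))
```

The second call shows the concern raised above: when the event itself draws people in, limited occupancy earns no credit at all.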
Heroes
I mean the term “heroes” to describe any human action that reduces the likelihood of a scenario or the severity of an incident. I do not intend for it to mean heroic action, or for that matter, to encourage heroic action. Responsible companies should actively discourage “heroism”, because attempts at heroism in the workplace are more likely to result in additional victims than heroes. Nonetheless, there are standard, non-heroic measures that can be carried out by personnel that have the effect of reducing risk.
There are two ways that human action can reduce the likelihood of a scenario or the severity of an incident: administrative controls and operator responses.
Administrative controls. Administrative controls are procedural measures that rely on human action. These procedural measures are done routinely to prevent a hazard, not in response to an initiating event. The best administrative controls are audited. This includes monitoring the performance of the procedure to verify that its performance is truly effective. It also includes training. An administrative control that is credited in a LOPA as being necessary to reduce risk to a tolerable level is a safety critical procedure. Refresher training on the procedure must be frequent enough to ensure that it performs as intended.
Operator responses. Operator responses differ from administrative controls in that they occur on demand, in response to initiating events, rather than routinely. The effectiveness of an operator response depends on sufficient training to avoid mistakes (which result from not knowing what to do) and violations (which result from not believing that the designed response is necessary or appropriate). It also depends on sufficient time to respond.
This means time to detect the unsafe condition, time to decide what to do, time to act, and time for the action to take effect. The sum of these times must be less than the process safety time, which is the time from initiating event to hazardous event. In addition to sufficient response time, there must be sufficient buffer time. When there is no buffer time, the response time is equal to the process safety time, and all that is required of the operator is perfection. Perfection is an unreasonable expectation under any circumstance and wholly untenable in an emergency.
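The time budget described above can be checked with a few lines of code. This is only a sketch: the class and function names are hypothetical, the times are made-up minutes, and the required buffer is left as a parameter because how much buffer is "sufficient" is a judgment I am not prescribing here.

```python
from dataclasses import dataclass


@dataclass
class OperatorResponse:
    """The four components of operator response time, in minutes."""
    detect_min: float       # time to detect the unsafe condition
    decide_min: float       # time to decide what to do
    act_min: float          # time to carry out the action
    take_effect_min: float  # time for the action to take effect

    def total_min(self) -> float:
        return (self.detect_min + self.decide_min
                + self.act_min + self.take_effect_min)


def buffer_time_min(response: OperatorResponse,
                    process_safety_time_min: float) -> float:
    """Time left over between total response time and process safety time."""
    return process_safety_time_min - response.total_min()


def has_sufficient_time(response: OperatorResponse,
                        process_safety_time_min: float,
                        required_buffer_min: float) -> bool:
    """True only if the response fits within the process safety time
    and leaves at least the required buffer (an assumed, site-specific value).
    """
    buffer = buffer_time_min(response, process_safety_time_min)
    return buffer > 0.0 and buffer >= required_buffer_min


# Hypothetical response: 2 + 3 + 4 + 1 = 10 minutes in total.
response = OperatorResponse(detect_min=2.0, decide_min=3.0,
                            act_min=4.0, take_effect_min=1.0)

print(buffer_time_min(response, process_safety_time_min=20.0))
print(has_sufficient_time(response, 20.0, required_buffer_min=5.0))
print(has_sufficient_time(response, 10.0, required_buffer_min=0.0))
```

The last call is the zero-buffer case: the response time equals the process safety time, so the check fails even when no buffer is formally required.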
Understanding the Roles of Human Action
When it comes to process incidents and scenarios, there are many roles that people will play. Whether as villain, victim, or hero, human action has a profound influence on the likelihood of any scenario and on the severity of any incident. Managing that risk requires understanding those roles.
Every story has its villains and its victims. Otherwise, there is no story. Every story with a happy ending also has its heroes. Understand those roles, and you will make your process story one with a happy ending.
This blog is based in part on an article, “Villain, victims, and heroes: Accounting for the roles human activity plays in LOPA scenarios”, published in the Journal of Loss Prevention in the Process Industries in July 2014. For those who are interested, the article goes into more detail on strategies for managing these roles.