
Working in the design of large-scale engineering systems, and watching safety/risk reviews, it's a little disturbing to read this and note that almost all hazard and safety analysis performed runs almost completely counter to these concepts.

It generally assumes there is one super-cause, plus maybe some things that contributed. (Usually the exercise is even pre-framed as "root cause" analysis before anyone starts looking for the problem.)

The ultimate (though unstated) goal is almost always to find out how a specific person messed up, and then note what they did wrong. (Kind of human nature.)

The culture usually assumes humans are inherently unsafe (i.e., they don't create safety), and that we're protecting them from themselves. (This probably does match the statement that complex systems are heavily layered with protections against failure.)

It often assumes that we can achieve a level of omniscient safety, where no one is ever unsafe and we see all problems before they occur (safety-culture names that imply "less than zero problems" or "we make you safer working here").

The probabilistic nature of accidents is not acknowledged; it's usually whack-a-mole instead. (This often ties in with hindsight bias: noting how a practitioner messed up the otherwise perfect safety system.)

Problem is, I'm not sure how you would actually implement a good, probabilistic safety system that largely keeps people safe, but acknowledges that bad, random things occasionally happen, and that line folks are your best defense for seeing and stopping them. It's counter to the whole leadership meme of decisive action and quick resolution to project strength. It's not very satisfying to hear "we could have spent $1M more on our safety program, but Bob still would have been burnt, because it was due to three unlikely things occurring in quick succession."



> Problem is, I'm not sure how you would actually implement a good, probabilistic safety system that largely keeps people safe, but acknowledges bad, random things occasionally happen, and that line folks are your best defense for seeing and stopping it.

Through the engineering process, though, you can generally get an idea of where your weakest/least-safe points are, based on previous studies. I see no reason one couldn't stack those failure points into a probabilistic matrix and then apply mitigation methods around those points.
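To make the idea concrete, here is a minimal sketch of what "stacking failure points into a probabilistic matrix" might look like: a Monte Carlo run over a few hypothetical failure points (the names and probabilities are made up for illustration), where an accident requires all of them to co-occur in the same shift, matching the "three unlikely things in quick succession" scenario above. The same structure lets you re-run the simulation with a mitigation applied to any one point and compare rates.

```python
import random

# Hypothetical failure points with per-shift probabilities.
# These names and numbers are illustrative assumptions, not real data.
FAILURE_POINTS = {
    "guard_interlock_bypassed": 0.01,
    "hot_surface_unlabeled": 0.005,
    "worker_fatigued": 0.02,
}

def shift_has_accident(rng, points):
    """One shift: an accident occurs only if every failure point fires."""
    return all(rng.random() < p for p in points.values())

def estimate_accident_rate(points, n_shifts=1_000_000, seed=42):
    """Monte Carlo estimate of accidents per shift."""
    rng = random.Random(seed)
    accidents = sum(shift_has_accident(rng, points) for _ in range(n_shifts))
    return accidents / n_shifts

# Analytic rate for independent co-occurrence: product of probabilities.
analytic = 1.0
for p in FAILURE_POINTS.values():
    analytic *= p  # 0.01 * 0.005 * 0.02 = 1e-6 per shift

# A candidate mitigation: halve the probability of the most likely point.
mitigated = dict(FAILURE_POINTS)
worst = max(mitigated, key=mitigated.get)
mitigated[worst] /= 2

print(f"analytic rate:           {analytic:.2e}")
print(f"simulated rate:          {estimate_accident_rate(FAILURE_POINTS):.2e}")
print(f"simulated (mitigated):   {estimate_accident_rate(mitigated):.2e}")
```

The point of a sketch like this isn't the made-up numbers; it's that the output is a rate, not a root cause, which forces the uncomfortable framing from the parent comment: even after spending on mitigation, the expected accident rate is small but nonzero.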

The acceptance of random failure as something largely unavoidable, though, is something that can't be engineered away; it's a human trait. A tire blowout on an 18-wheeler doesn't necessarily mean the safety design for that tire failed: handling the subsequent load-balance shift is the unrecognized catastrophe mitigation built into the system. Yet people will still focus on the tire.



