A Post Incident Review Review

Friday, December 09, 2022 - 3:20 pm–4:20 pm AEDT

Tom Partington, ANZx

Abstract: 

Our post incident process is a little different to most, mainly because of what it doesn't include rather than what it does.

We don't identify a root cause, we don't create or track action items, and we don't report on incident counts or MTTRs. We also work in a highly regulated industry, in a 1000+ person organisation, and repeat incidents are rare.

In this talk I'll discuss how and why we developed this process, and look at the safety science behind the concepts.

Tom Partington, ANZx

Tom has held many different titles, the most recent of which is Site Reliability Engineer, but whatever the job was called, the work has generally remained the same: trying to keep systems running with sticky-tape and glue (and not always successfully). Understanding how things break became something of an obsession, and he fell deep into the safety science rabbit hole, where he discovered that many other industries had been studying accidents and human performance long before these computer things became popular.

BibTeX
@conference {284955,
author = {Tom Partington},
title = {A Post Incident Review Review},
year = {2022},
address = {Sydney},
publisher = {USENIX Association},
month = dec
}
