Postmortem Action Items: Plan the Work and Work the Plan
John Lunney, Sue Lueder, and Betsy Beyer
In the 2016 O’Reilly book Site Reliability Engineering, Google described our culture of blameless postmortems and recommended that operationally focused teams and organizations adopt a similar culture in their approach to production incidents. A postmortem is a written record of an incident that details its impact, the actions taken to mitigate or resolve it, the root cause(s), and the follow-up actions taken to prevent the incident from recurring. That book’s chapter “Postmortem Culture: Learning from Failure” describes criteria for deciding when to conduct postmortems, offers best practices for postmortems, and gives advice on cultivating a postmortem culture, all based on the experience we’ve gained over the years.