Connie-Lynne Villani, Grilled Cheese Invitational
Post-mortems are a great start to incident analysis, but are they always necessary? How do you sift through the information to produce a good incident report that people will actually read and act on? Do you even need to produce an incident report, or can you just make a fix and get on with life? What are you forgetting to include, and what are you including that isn’t necessary?
This mini-tutorial will help you answer these questions, with an emphasis on:
- How to prepare for incident analysis before anything goes wrong
- Learning from systemic failure
- Anatomy of a good written incident analysis
- The myth of root cause analysis
- Conducting an incident review meeting
- Studying both "what went wrong" and "what went right"
- Developing a culture of responsibility without blame
Who should attend:
Sysadmins whose work focuses on site or application stability, or anyone who has ever had to explain "what went wrong."
Take back to work:
Attendees will leave this tutorial with:
- Real-world examples of good incident reviews
- Templates for lightweight and in-depth incident analysis
- Patterns and anti-patterns for post-event retrospectives
- Techniques for holding civil discussions about failure and improvement
- Techniques for improving system reliability by learning from success
Topics include:
- Incident and root cause analysis
- Technical writing
- Agile retrospectives
- Failure recovery
- Monitoring and alerting
- Event response
With degrees in both Electrical Engineering and Theater Management, Connie-Lynne brings 20 years of systems engineering experience to the table, as well as a keen understanding of how to handle drama in the workplace. In addition to founding and managing Groupon's first SRE team, Connie-Lynne has worked at Linden Lab, Change.org, and Caltech, but admits that her most enjoyable position is serving as a board member for the Grilled Cheese Invitational, an annual food festival celebrating all things cheesy.
author = {Connie-Lynne Villani},
title = {Living in a {Post-Post-Mortem} World: Techniques for Incident Analysis},
year = {2016},
address = {Boston, MA},
publisher = {USENIX Association},
month = dec
}