Running Excellent Retrospectives: What Happened?

Tuesday, October 30, 2018 - 9:00 am–10:30 am

Courtney Eckhardt, Heroku, a Salesforce company

Abstract: 

Your site’s back up, you’re back in business. Do you have a way to make sure that problem doesn’t happen again? And if you do, do you like how it works?

Heroku uses a blameless retrospective process to understand and learn from our operational incidents. This tutorial will share the process we use and give you a chance to practice analyzing operational problems using the internal and external communications of a real Heroku operational incident. Along the way, we’ll discuss how Heroku developed this process, what issues we were trying to solve, and how we’re still iterating on it.

Courtney Eckhardt first got into retrospectives when she signed up for comp.risks as an undergrad (and since then, not as much has changed as we’d like to think). Her perspectives on engineering process improvement are strongly informed by the work of Kathy Sierra and Don Norman (among others).

BibTeX
@conference {221828,
author = {Courtney Eckhardt},
title = {Running Excellent Retrospectives: What Happened?},
year = {2018},
address = {Nashville, TN},
publisher = {USENIX Association},
month = oct
}
Who should attend: 

Engineers and engineering managers who want to bring an incident retrospective process to their org, or improve one they already have.

Take back to work: 

Attendees will have the materials and firsthand experience to advocate for (or to begin) an incident retrospective process at their workplace, or to improve a process they might already be using.

Topics include: 
  • Why run a retrospective
  • Goal of a retrospective
  • Blameless retrospectives
  • How to structure a retrospective
  • Preparing for a retrospective
  • Five “why”s / infinite “how”s
  • How to understand human error
Prerequisites: 

There are no specific prerequisites, though this tutorial is probably not suitable for extremely junior engineers.