Your System Has Recovered from an Incident, but Have Your Developers?

Thursday, March 29, 2018 - 2:50 pm–3:10 pm

Jaime Woo


Mistakes are inevitable, and happen to the best of us. Our industry adopts a blame-free culture, but that doesn't negate the sting that occurs when we're at the heart of a mess-up.

Developers continually raise the bar on how to prevent errors, mitigate damage from those that arise, and wring out as many learnings as possible after the damage is done. But much of this work focuses on the products, not the people. And given the high stakes in SRE, the psychological impact of a mistake can run the gamut from minor to near-traumatic.

Where are the game day exercises that simulate how to support a coworker who just caused 3 a.m. pings and 20-hour workdays? What resources should we share to help people understand the stages of emotions they'll feel after a major incident?

The concept of psychological safety is well understood as a key predictor for high-performing teams, but what does that entail? Drawing from our work at Shopify, and lessons from fields like sports, medicine, and even theatre, attendees will leave with a series of tangible actions and exercises to help restore team trust and rebuild a developer's confidence.

Jaime Woo

Jaime Woo's first job was growing bacteria in a lab to help build artificial membranes. A former journalist who led technology communications at Shopify, he worked closely with production engineering to help build team culture and collaboration through internal and external communications. He lives in Toronto and has a dog named Taco.

SREcon18 Americas Open Access Videos


@conference {213122,
author = {Jaime Woo},
title = {Your System Has Recovered from an Incident, but Have Your Developers?},
year = {2018},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = mar
}
