Your System Has Recovered from an Incident, but Have Your Developers?

Wednesday, 29 August, 2018 - 16:00-16:45

Jaime Woo, DigitalOcean

Abstract: 

Mistakes are inevitable, and happen to the best of us. Our industry adopts a blame-free culture, but that doesn't negate the sting that occurs when we're at the heart of a mess-up.

Developers continually raise the bar on how to prevent errors, mitigate damage for ones that arise, and wring out as many learnings as possible after the damage is done. But much of this work is focused on the products, and not the people. And given the high stakes in SRE, the psychological impact of a mistake can run the gamut from minor to near-traumatic.

Where are the game day exercises that simulate how to support a coworker who just caused 3 a.m. pings and 20-hour workdays? What resources should we share to help people understand the stages of emotions they'll feel after a major incident?

The concept of psychological safety is well understood as a key predictor for high-performing teams, but what does that entail? Drawing from original research, and lessons from fields like sports, medicine, and even stand-up comedy, attendees will leave with a series of tangible actions and exercises to help restore team trust and rebuild a developer's confidence.

Jaime Woo, DigitalOcean

Jaime Woo started his career as a molecular biologist, working on cartilage replacements. While he adored nurturing genetically-modified E. coli, he realized his main passion was storytelling. He has written an award-nominated book, launched the Engineering blog at Riot, built the technology communications team at Shopify, and currently shepherds content at DigitalOcean. He has a dog named Taco that he will absolutely show you pictures of.

SREcon18 Europe/Middle East/Africa Open Access Videos
Sponsored by Indeed

BibTeX
@inproceedings {218909,
author = {Jaime Woo},
title = {Your System Has Recovered from an Incident, but Have Your Developers?},
booktitle = {SREcon18 Europe/Middle East/Africa (SREcon18 Europe)},
year = {2018},
address = {Dusseldorf},
url = {https://www.usenix.org/node/218910},
publisher = {USENIX Association},
month = aug
}
