Sarah Butt, Salesforce
In many ways, incident management is the "emergency room" for technical systems. As technology has evolved, it has progressed from auxiliary systems, to essential business systems of record, to critical systems of engagement across multiple industries. As these systems have become increasingly critical, SRE's role in incident management and resolution has become vital for any essential technical system. This talk focuses on how various strategies used in the medical field can be applied to incident response. From looking at algorithm-guided decisions (and learning a bit about what "code blue" really means), to discussing approaches to triage and stabilization based on the ATLS protocol, to considering the role of response standardization such as surgical checklists in reducing cognitive overhead (especially when PagerDuty goes off at 2 a.m.!), this talk aims to take key learnings from the medical field and apply them in practical ways to incident management and response. This talk is largely conceptual in nature, with takeaways for attendees from a wide variety of backgrounds and technical experience levels.
Sarah is a former audio engineer turned technology professional who has spent the past 6 years of her career at Salesforce and Dell devoted to customer-perceived reliability. She is a 2021 MBA graduate from The University of Texas (Hook 'em!), where she did graduate work studying the intersection of technology, business, and people in the context of SRE. A few of her favorite topics include user-centric monitoring, intelligent alerting, and using innovative technology to drive high availability of complex distributed systems. Sarah is currently part of Salesforce's SRE organization, where you'll likely find her talking about topics such as resilience, observability, and incident management and response. In her free time, you'll often find her hiking in the Texas Hill Country with Rosie, her yellow lab.
SREcon21 Open Access Sponsored by Indeed