Will Gallego, Etsy
SREs are frequently front and center in intense, highly demanding production situations that require clear lines of communication. Our systems fail not because of a lack of attention or laziness but because of cognitive dissonance between what we believe about our environments and the objective interactions both internal and external to them. In this talk, I’ll discuss how we can revisit our established beliefs surrounding failure scenarios, with an emphasis not on the who in decision making but on the why behind those decisions. With this mindset, we can encourage our teams to reject shallow explanations that attribute failures to human error and focus instead on gaining a greater understanding of these complexities. I’ll walk through the structure of post mortems used at large tech companies, with real-world examples of failure scenarios, and debunk myths regularly attributed to failures. Through these discussions, you’ll learn how to incorporate open dialogue within and between teams to bridge these gaps in understanding.
Will Gallego is a systems engineer with 15+ years of experience in the web development field, currently a Staff Engineer at Etsy. Comfortable with several parts of the stack, he now focuses on building scalable, distributed backend systems and tools to help engineers grow. He believes in a free and open internet, blame-aware post mortems, and pronouncing gif with a soft “G”.
SREcon18 Americas Open Access Videos
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.