Move Fast and Learn Things: Principles of Cognition, Teaming, and Coordination to Support High Performance and Resilient Site Reliability Engineering

Wednesday, December 07, 2022 - 10:40 am–11:40 am AEDT

Dr. Laura Maguire and Nora Jones, Jeli

Abstract: 

What if we designed work systems to better support SREs in building and maintaining large-scale distributed systems?

With software systems running at speed and scale, and with talent scarce, it is increasingly critical to design work systems that support an SRE's ability to recognize anomalies, adapt to changing conditions, and coordinate effectively across inter- and intra-organizational boundaries.

In this talk, you'll learn about the cognitive and coordinative mechanisms that underlie resilient software engineering. Drawing from engineering psychology, design thinking, cognitive systems engineering, and contemporary management theory, this talk will serve as a primer for better understanding yourself, members of your team, and organizational life in general. Using practical case study examples and engaging stories from 5 years of studying software engineers at work while they build, maintain, and repair large-scale distributed systems, this session will enlighten and inspire your SRE practice.

Laura Maguire, Jeli

Laura leads the research program at Jeli.io. She has a Master's degree in Human Factors & Systems Safety and a PhD in Cognitive Systems Engineering. Her doctoral work focused on distributed incident response practices in DevOps teams responsible for critical digital services. She was a researcher with the SNAFU Catchers Consortium from 2017 to 2020, and her research interests lie in resilience engineering, coordination design, and enabling adaptive capacity across distributed work teams.

Nora Jones, Jeli

Nora Jones is the founder and CEO of Jeli. She is a software engineer and leader with 10+ years of experience at innovative companies including Netflix and Slack. Nora's focus on the sociotechnical aspects of engineering — the intersection between how people and software work together in practice in distributed systems — is a founding pillar of Jeli, as well as the Chaos Engineering movement and learningfromincidents.io community, both of which Nora helped build from the start.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone.

BibTeX
@conference {284877,
author = {Laura Maguire and Nora Jones},
title = {Move Fast and Learn Things: Principles of Cognition, Teaming, and Coordination to Support High Performance and Resilient Site Reliability Engineering},
year = {2022},
address = {Sydney},
publisher = {USENIX Association},
month = dec
}
