Principles of Safety and Reliability Learned from US Navy Landing Signal Officers

Wednesday, December 07, 2022 - 1:30 pm–2:30 pm AEDT

Matthew Brahms, OpsLevel

Abstract: 

Recovering fighter aircraft aboard an aircraft carrier is an extremely complicated and dangerous process, even under the best conditions. A successful arrestment aboard the boat depends on specific parameters and processes that must be adhered to, and on problems that must be overcome.

While this topic is seemingly unrelated to software engineering, many parallels can be drawn and principles learned from how the role of Landing Signal Officer is performed. As a community, Landing Signal Officers have crafted and honed their role at great cost in order to recover aircraft safely and expeditiously. These parallels and principles have implications for both SRE practitioners and the field of SRE as a whole.

Matthew Brahms, OpsLevel

As a Site Reliability Engineer, Matthew works to build scalable/resilient systems and instill SRE culture into the teams he embeds with (SLI, SLO, SLA, anyone?!). Previous roles have included DevOps Engineer, Systems Administrator, and professional Classical musician.

Originally from Columbus, OH, Matthew holds degrees from The Ohio State University and Carnegie Mellon University in Pittsburgh. Currently he lives in Austin, TX, and enjoys working with Kubernetes, Go, and other cloud native technologies.

Other favorite activities include spending time with his family; training for a marathon; eating a whole-food, plant-based diet; and talking about and listening to all things Classical music.

BibTeX
@conference {284885,
author = {Matthew Brahms},
title = {Principles of Safety and Reliability Learned from {US} Navy Landing Signal Officers},
year = {2022},
address = {Sydney},
publisher = {USENIX Association},
month = dec
}
