Kurt Andersen, LinkedIn
Site Reliability is a journey, not a destination...
Participants in the site reliability field come from varied backgrounds, and from companies that have implemented SRE principles and practices to varying degrees. There are no hard boundaries on this journey, but a phased model of skill acquisition reveals useful signposts along the way to help the traveler.
I'll examine a selection of exemplar values and practices in detail, then extrapolate to a wider set of practices, to explore some of the landmarks that characterize the approaches used by SRE teams. This can help participants evaluate where they and their company are operating along the spectrum of practice, and plan for the next turns in the journey.
The practice areas that I'll cover include incident prevention and handling, postmortems, KPIs/SLOs, monitoring, and capacity management.
Kurt Andersen is one of the co-chairs for SREcon18 Americas and has been active in the anti-abuse community for over 15 years. He is currently the senior IC for the Product SRE (site reliability engineering) team at LinkedIn. He also serves as one of the Program Committee Chairs for the Messaging, Malware, and Mobile Anti-Abuse Working Group (M3AAWG.org). He has spoken at M3AAWG, Velocity, SREcon, and SANOG on various aspects of reliability, authentication, and security.
author = {Kurt Andersen},
title = {Characterizing and Understanding Phases of {SRE} Practices},
year = {2018},
publisher = {USENIX Association},
month = jun
}