Characterizing and Understanding Phases of SRE Practices

Friday, June 08, 2018 - 9:00 am–9:55 am

Kurt Andersen, LinkedIn

Abstract: 

Site Reliability is a journey, not a destination...

Participants in the site reliability field come from varied backgrounds and from companies with varying levels of implementation of SRE principles and practices. There are no hard boundaries on this journey, but a phased model of skill acquisition reveals useful signposts along the way to help the traveler.

Using a selection of exemplar values and practices for detailed examination and then extrapolating to a wider set of other practices, I'll explore some of the landmarks that can characterize the approaches used by SRE teams. This can help participants to evaluate where they and their company are operating along the spectrum of practice and can be helpful when looking toward and planning for the next turns in the journey.

Some of the practice areas that I'll cover include incident prevention and handling, postmortems, KPIs/SLOs, monitoring, and capacity management.

BibTeX
@conference {214923,
author = {Kurt Andersen},
title = {Characterizing and Understanding Phases of {SRE} Practices},
year = {2018},
publisher = {{USENIX} Association},
}