Real World SLOs and SLIs: A Deep Dive

Thursday, 30 August, 2018 - 16:45-17:30

Matthew Flaming and Elisa Binette, New Relic

Abstract: 

If you've read almost anything about SRE best practices, you've probably come across the idea that clearly defined and well-measured Service Level Objectives (SLOs) and Service Level Indicators (SLIs) are a key pillar of any reliability program. SLOs allow organizations and teams to make smart, data-driven decisions about risk and the right balance of investment between reliability and product velocity.

But in the real world, SLOs and SLIs can be challenging to define and implement. In this talk, we’ll dive into the nitty-gritty of how to define SLOs that support different reliability strategies and modalities of service failure. We’ll start by looking at key questions to consider when defining what “reliability” means for your organization and platform. Then we'll dig into how those choices translate into specific SLI/SLO measurement strategies in the context of different architectures (for example, hard-sharded vs. stateless random-workload systems) and availability goals.

Matthew Flaming, New Relic

Matthew Flaming began his career in software engineering back when creating a web portal meant hacking together your own version of JSP and racking your own Solaris boxes. Since then he has led the development of complex, high-scale backend systems ranging from CDNs to IoT platforms, with an equal emphasis on technical architecture and building organizations where innovation thrives. In his current role as VP of Site Reliability at New Relic, he focuses on the SRE practice and the technical, operational, and cultural aspects of scaling and reliability.

Elisa Binette, New Relic

Elisa Binette is a Senior Engineering Manager within the Site Reliability Organization at New Relic. The group focuses on helping teams measure and achieve their reliability goals, improving reliability both for the engineers within the company and for the end customers of New Relic. She’s actively involved with PDXWIT, a local non-profit whose purpose is to strengthen the Portland women-in-tech community. She also loves martial arts and has enjoyed both practicing and teaching classes for many years.

SREcon18 Europe/Middle East/Africa Open Access Videos
Sponsored by Indeed

BibTeX
@inproceedings {218961,
author = {Matthew Flaming and Elisa Binette},
title = {Real World {SLOs} and {SLIs}: A Deep Dive},
booktitle = {SREcon18 Europe/Middle East/Africa (SREcon18 Europe)},
year = {2018},
address = {Dusseldorf},
url = {https://www.usenix.org/node/218962},
publisher = {USENIX Association},
month = aug
}
