Principled Performance Analytics

Wednesday, March 16, 2022 - 10:50 am–12:10 pm

Narayan Desai and Brent Bryan, Google


This talk presents an analytical method that delivers high-fidelity insights for analyzing and diagnosing distributed systems. It has been used in production, with good results, on a variety of complex services at scale (up to 1.4T events/day) where traditional methods have failed. We will sketch out the problem domain in detail and present the statistical methods used, as well as the intuition behind the approach.

Attendees will gain an alternative lens through which they can analyze performance, as well as an understanding of pitfalls.

Narayan Desai, Google

Narayan is an SRE at Google Cloud, where he is responsible for the reliability of GCP Data Analytics products. He has a checkered past, having worked on scheduling, configuration management, supercomputers, and metagenomics—always in the context of production systems.

Brent Bryan, Google

Brent is an SRE at Google Cloud focused on developing statistical and ML approaches to monitor service reliability. Prior to GCP SRE, Brent worked on ads optimization, serving, and measurement, as well as founding Google Domains.

SREcon22 Americas Open Access Sponsored by Blameless

@conference {278158,
author = {Narayan Desai and Brent Bryan},
title = {Principled Performance Analytics},
year = {2022},
address = {San Francisco, CA},
publisher = {USENIX Association},
month = mar
}
