Unified Reporting of Service Reliability

Thursday, June 13, 2019 - 5:30 pm–6:00 pm

Helen Zhang, Google

Abstract: 

We built a unified reporting system that brings together data from sources that lived in unconnected silos, such as SLO reporting metrics, postmortems, incident response tools, and customer support tickets. The system ingests and correlates data from these sources and stores the processed data in a new database. People from a variety of teams can then use that data to create customized dashboards that suit their particular reporting needs.

Helen Zhang, Google

Helen Zhang is a staff software engineer at Google SRE. During her nine years with Google, she has worked with hundreds of developers across the company to launch mission-critical production services. She recently led a team to build a unified service reporting system for service reliability.


BibTeX
@conference {233331,
author = {Helen Zhang},
title = {Unified Reporting of Service Reliability},
year = {2019},
address = {Singapore},
publisher = {USENIX Association},
month = jun
}
