Know Thy Enemy: How to Prioritize and Communicate Risks

Thursday, March 29, 2018 - 2:05 pm–2:45 pm

Matt Brown, Google


Every SRE team attempting to manage, mitigate, or eliminate the risks facing their system will encounter two fundamental problems:

1. As humans, our intuitive judgement about risk is unreliable.
2. The work required to address all potential risks far outstrips our available time and resources.

The CRE team (Customer Reliability Engineering, a group of Google SREs who partner with cloud customers to implement SRE practices in their applications and across the cloud provider/customer relationship) battles these challenges every day in our interactions with customers. We have drawn on Google’s deep experience managing reliable systems, and on the broader field of risk management techniques, to develop a process that allows us to communicate an objective ranking of risks and their expected cost to a system. This ranking and the associated cost data can then be used as an input to team and business decision making.

This talk will cover the development of our process, explain how anyone can apply it to any system today, and demonstrate how the resulting ranking and costs provide objective, consistent data that can take the subjectivity out of often tense discussions about work priorities and focus (e.g., more features or more reliability?).

Matt Brown, Google

Matt began his SRE career with Google in Dublin in 2007, moved to London in 2012, and since 2016 has worked remotely from Cambridge, New Zealand.

During this time Matt has worked on or led a diverse range of SRE teams, with responsibilities ranging from Google's internal corporate infrastructure through to the Internet-facing load-balancing infrastructure responsible for keeping Google fast and always available.

In his current role with the Customer Reliability Engineering team, he is pioneering ways to apply SRE practices across organisations, addressing the challenges posed by today's world, where the traditional boundaries between platforms and their customers are being blurred.


@conference {213116,
author = {Matt Brown},
title = {Know Thy Enemy: How to Prioritize and Communicate Risks},
year = {2018},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = mar
}
