The Math behind the Incident Aftermath: A Practical Guide to Measuring Incident Impacts

Wednesday, December 07, 2022 - 10:40 am–11:40 am AEDT

Ashish Patel and Sriram Srinivasan, PayPal

Abstract: 

Even with world-class reliability systems, incidents still occur, with varying levels of impact on our business. One of the many steps in the aftermath of an incident is measuring that impact.

Accurately measuring the financial impact of an incident is an important part of incident management, and it is needed in real time for many reasons, including regulatory requirements.

However, calculating the impact is not easy given the dynamics of a complex distributed system. Because SRE is deeply rooted in automation, we have built an Incident Impact Calculation Framework that accurately measures incident impact using various statistical and ML models. It runs in seconds, is completely automated, and fits well with our other incident management tools.
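
As a rough illustration only, and not the framework described in this talk, one simple statistical approach is to compare what was observed during the incident window against a baseline built from comparable past windows. The sketch below assumes a hypothetical per-minute metric (for example, payment volume), a hypothetical function name `estimate_impact`, and a median-of-past-windows baseline; all of these are assumptions chosen for illustration.

```python
from statistics import median


def estimate_impact(history, actual):
    """Estimate the shortfall during an incident window.

    history: list of past windows, each a list of per-minute values
             aligned with the incident window (hypothetical input shape).
    actual:  per-minute values observed during the incident.
    Returns the summed shortfall of actual below the median baseline.
    """
    if not history:
        raise ValueError("history must contain at least one past window")
    if any(len(window) != len(actual) for window in history):
        raise ValueError("every past window must match the incident length")

    # Baseline for each minute: median of that minute across past windows.
    baseline = [median(values) for values in zip(*history)]

    # Count only the shortfall; minutes above baseline contribute zero.
    shortfall = [max(b - a, 0.0) for b, a in zip(baseline, actual)]
    return sum(shortfall)


if __name__ == "__main__":
    past_windows = [[100, 100, 100], [110, 90, 100], [90, 110, 100]]
    observed = [100, 40, 10]
    print(estimate_impact(past_windows, observed))
```

A median baseline is used here because it is less sensitive to a single unusual past window than a mean; a production system would likely also need to account for seasonality, trend, and deferred (recovered) transactions.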

Building a real-time, reliable, and robust impact assessment framework is not as straightforward as it seems. Join us in this session to learn more about impact assessment, its design, and its challenges.


BibTeX
@conference {284875,
author = {Ashish Patel and Sriram Srinivasan},
title = {The Math behind the Incident Aftermath: A Practical Guide to Measuring Incident Impacts},
year = {2022},
address = {Sydney},
publisher = {USENIX Association},
month = dec
}
