Smarter Disasters: End-to-End Automation for Incidents

Thursday, June 07, 2018 - 11:25 am–11:50 am

Karthik Nilakant, Xero

Abstract: 

In this talk, I will discuss the different aspects of incident management and how we've built automation around each part at Xero. This includes: transforming manual alerts into automatic notifications through an issue report pipeline; a chat bot that streamlines incident coordination by facilitating effective communication and providing guidance through the process; and how we extract data from each incident for postmortem review. I'll also discuss how our tools have evolved and the lessons we learned along the way.

Karthik Nilakant, Xero

Karthik is a Senior Site Reliability Engineer at Xero. He's been based in the Auckland (New Zealand) office since 2016. Previously, he worked as a computer systems researcher and an enterprise server infrastructure consultant, both in New Zealand and the United Kingdom.

BibTeX
@conference {214985,
author = {Karthik Nilakant},
title = {Smarter Disasters: {End-to-End} Automation for Incidents},
year = {2018},
publisher = {USENIX Association},
month = jun
}
