What Breaks Our Systems: A Taxonomy of Black Swans

Monday, March 25, 2019 - 9:00 am–9:30 am

Laura Nolan

Abstract: 

Black swan events: unforeseen, unanticipated, and catastrophic issues. These are the incidents that take our systems down, hard, and keep them down for a long time.

By definition, you cannot predict true black swans. But black swans often fall into categories that we've seen before. This talk examines those categories, including unforeseen hard capacity limits, cascading failures, and hidden system dependencies, and how we can harden our systems against them.

Laura Nolan

Laura Nolan's background is in Site Reliability Engineering, software engineering, distributed systems, and computer science. She wrote the 'Managing Critical State' chapter in the O'Reilly 'Site Reliability Engineering' book and contributed to the more recent 'Seeking SRE'. Laura has been in the software industry for 15 years, most recently as a Staff SRE at Google.

BibTeX
@conference {229525,
author = {Laura Nolan},
title = {What Breaks Our Systems: A Taxonomy of Black Swans},
year = {2019},
address = {Brooklyn, NY},
publisher = {{USENIX} Association},
}