Disaster Preparedness
Moderator: Bethanye Blount, Facebook
Panelists: Kripa Krishnan, Google; Richard Waid, LinkedIn; Mat Schaffer, Netflix
While a good SRE team is proactive during the design phase, many things can affect the reliability of a service. A critical aspect of planning is practicing for the worst-case scenario and finding out whether the product and team can respond and recover quickly. Many teams do continuous testing, spot-checking, or large-scale annual tests. This panel will discuss how to run these tests safely, how to justify them to the company, and the follow-up required to get the full benefit from them.
Richard Waid joined LinkedIn in 2012, moving to Mountain View from New Zealand, where he built a diverse background in software engineering and operations. As a Staff SRE at LinkedIn, Richard faces the daily paradox of SREs everywhere: figuring out what went wrong while making sure that it never happens again. His responsibilities at LinkedIn include leading the SRE team responsible for the LinkedIn profile and engagement.
author = {Kripa Krishnan and Bethanye Blount and Richard Waid and Mat Schaffer},
title = {Disaster Preparedness},
year = {2014},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = may
}