Chaos Engineering at Scale

Thursday, December 08, 2022 - 9:00 am–10:00 am AEDT

Sharath Reddy and Venkatesh Maligireddy, PayPal

Abstract: 

SREs and application owners commonly run into the following questions and scenarios in their day-to-day work:

  • "If only we had seen this sooner…" during a site incident.
  • "What happens if one of my service dependencies fails?"
  • "How reliable is my application in the production environment?"

Chaos engineering has evolved into a must-have part of SRE culture that addresses the questions above. It improves the resiliency of internal systems, which gives teams the confidence and a path to provide best-in-class products at scale.

In this talk, we will cover:

  1. The Chaos principles
  2. How to prepare for a Chaos journey in an organization
  3. How to conduct Chaos Gamedays
  4. How to measure and track the resiliency of a system
  5. How to leverage existing open-source Chaos platforms

Sharath Reddy, PayPal

Sharath is an engineer with 10 years of experience in software. He has worked in product development as well as site reliability, at large enterprises and at a couple of startups. He has a strong passion for working on complex engineering problems, which generally keeps him going, and a penchant for the elegant design of systems. Apart from this, he follows and sometimes plays cricket and soccer.

Venkatesh Maligireddy, PayPal

Venkatesh Maligireddy is a Senior Software Engineer at PayPal, where he works on building the enterprise Chaos platform. In his prior roles, he led a team that built a ChatOps platform that helps automate incident management and operational efficiency workflows at PayPal. In his role as an SRE, he has also worked on disaster recovery and parity measurement platforms across data centers.

BibTeX
@conference {284901,
author = {Sharath Reddy and Venkatesh Maligireddy},
title = {Chaos Engineering at Scale},
year = {2022},
address = {Sydney},
publisher = {USENIX Association},
month = dec
}
