Tired Reacting to Certificate Outages? Build Certificate Resilient Distributed Systems Using Chaos Engineering Practices

Thursday, March 23, 2023 - 2:45 pm–3:30 pm

Kaitlyn Yang and Vikram Raju, Microsoft

Abstract: 

Certificate-related disruptions and outages are very costly, causing loss of customer trust, negative media coverage, lost revenue, and employee burnout. Preventing certificate-related outages is hard because of a lack of frameworks, processes, and tools that can scale with ever-changing, complex distributed systems. In this talk, we will go over real-world certificate failure scenarios that are hard to continuously validate in a seamless manner. We will take a deep dive into how we leveraged chaos engineering practices and scaled our solution. We hope the audience will walk away knowing how to hunt, detect, continuously measure, and shift left the certificate resiliency of services in a scalable fashion.

Kaitlyn Yang, Microsoft

Kaitlyn is a Software Engineer at Microsoft, where she builds platforms that power the protection of Microsoft's and its customers' data. Kaitlyn also leads efforts to improve the security and reliability resilience of Azure by enhancing the Microsoft Chaos Studio product.

Vikram Raju, Microsoft

Vikram is a Product Manager at Microsoft working on Azure Chaos Studio, a fully managed service that helps users measure, understand, and build application and service resilience to real-world outages.

BibTeX
@conference {286252,
author = {Kaitlyn Yang and Vikram Raju},
title = {Tired Reacting to Certificate Outages? Build Certificate Resilient Distributed Systems Using Chaos Engineering Practices},
year = {2023},
address = {Santa Clara, CA},
publisher = {USENIX Association},
month = mar
}