Mikolaj Pawlikowski and Sachin Kamboj, Bloomberg
We were some of the earliest adopters of Chaos Engineering (especially in the financial industry) as a tool for SRE teams to increase their systems' reliability. We were also lucky enough to contribute to the ecosystem and watch it grow.
This talk will outline what we learned, what worked, and what didn't over the past five years of practicing Chaos Engineering.
Miko Pawlikowski is an Engineering Team Leader at Bloomberg, the author of "Chaos Engineering: Site Reliability Through Controlled Disruption," and a speaker. He maintains open source projects like PowerfulSeal, Goldpinger, and Syscall Monkey, which let you implement Chaos Engineering, monitor Kubernetes cluster connectivity, and intercept and modify syscalls, respectively.
Sachin Kamboj is a senior software engineer at Bloomberg, where he is part of the team that's designing and building Bloomberg's on-prem, next-generation Platform as a Service based on Kubernetes. He has been using Kubernetes in production since 2016, has presented at KubeCon, and has twice been nominated for best paper awards. Before joining Bloomberg, Sachin was an academic working on distributed systems and multi-agent systems and was the principal software architect behind the University of Delaware's vehicle-to-grid project, which led to a successful startup. Sachin loves breaking things to understand how they really work and likes to question why things are built and work the way they do. He is a strong proponent of Chaos Engineering and has used it successfully to make systems more robust and resilient to failures. When not breaking things, he spends his time playing with his two kids and enjoys hiking and ultimate frisbee.
SREcon21 Open Access Sponsored by Indeed