Executing Chaos Engineering in Production at a Critical Financial Institution

Tuesday, March 24, 2026 - 1:50 pm–2:35 pm

Luiz Siqueira and Leonardo Marques, Bradesco

Discover how Chaos Engineering transformed a high-stakes financial ecosystem processing thousands of transactions per second. This real-world case study unveils a reproducible framework for risk-averse organizations, blending fault injection, automation, and observability.

Key takeaways include safe experiment design with governance guardrails, automated chaos workflows, and multidisciplinary GameDays. Results: a 73% reduction in mean time to detect (MTTD), 10 hidden vulnerabilities exposed, 5 new metrics, and a shift to proactive reliability.

Learn a compliance-friendly methodology to turn failures into insights, bridging theory and measurable business impact in critical systems. Perfect for SREs, Developers, and Ops teams seeking production-ready resilience.

Luiz Siqueira is a specialist in Information Technology with over 15 years of experience in managing critical systems and operational reliability. His background includes an MBA in Site Reliability Engineering (SRE) and a degree in IT Management. He has worked on large-scale projects in companies such as IBM, Kyndryl, and Banco Bradesco, focusing on support, automation, and digital transformation. Currently, he serves as SRE Manager at Bradesco, leading initiatives to ensure scalability, resilience, and efficiency in digital environments. He holds certifications in SRE Foundation℠, Gremlin Chaos Engineering, ITIL, Agile, and Cloud Service Management. Beyond his technical expertise, he is recognized for building high-performance teams and advancing modern reliability practices in complex environments.

Leonardo Siqueira Marques is a Senior IT Operations and Site Reliability Engineering leader with over 25 years of experience in information technology, building and operating highly available, large-scale systems in the financial sector. He holds a degree in Computer Science and an MBA in Digital Transformation from the University of São Paulo (USP). Leonardo has been actively driving reliability transformation initiatives focused on Chaos Engineering, incident response, and operational maturity, emphasizing the use of controlled experimentation and real production failures as continuous learning mechanisms to build resilient, high-performing engineering teams.

BibTeX
@conference {316250,
author = {Luiz Siqueira and Leonardo Marques},
title = {Executing Chaos Engineering in Production at a Critical Financial Institution},
year = {2026},
address = {Seattle, WA},
publisher = {USENIX Association},
month = mar
}
