Clint Byrum, HashiCorp, an IBM Company
Working on complex systems and expecting to be right are not always compatible things. PostgreSQL is an incredibly complex component of complex systems, and how we use it is just as important as how it works internally.
We have spent several years operating a very busy AWS Aurora PostgreSQL database with varying levels of reliability, and we got pretty good at finding ways to break it with MultiXacts. This meant incidents, experiments, and learning to get a little less wrong with each iteration. Join us to learn about two very important things:
- Several decreasingly wrong ideas about how MultiXact locks work in PostgreSQL, with data and analysis from real incidents.
- How to persevere through the stress of incidents and keep learning in a complex system despite knowing that you're at least a little wrong, all the time.

Clint Byrum is a Staff SRE at IBM, working on the reliability and performance of HashiCorp Terraform, IBM's cloud offering for running Terraform. Clint has decades of experience in operations, open source, and software engineering, including working as a core developer on Ubuntu and OpenStack. More recently, Clint has been a full-time reliability engineer, leading efforts to stabilize and scale systems inside GoDaddy, Spotify, and HashiCorp, with a particular focus on Resilience and Learning from Incidents. Clint co-hosts a podcast about Resilience in Software called "This is Fine!" with Colette Alexander.

author = {Clint Byrum},
title = {5 Wrong Hypotheses about {PostgreSQL} {Multi-Transaction} Locks},
year = {2026},
address = {Seattle, WA},
publisher = {USENIX Association},
month = mar
}
