Reliability Is Performance

Due to the evolving Coronavirus/COVID-19 situation, SREcon20 Americas West has been rescheduled to June 2–4, 2020.

Wednesday, March 25, 2020 - 2:30 pm–3:05 pm

Bradley Shively, Uber ATG


"Make it faster!" is a common refrain from managers and customers. As teams that build and run systems, we are regularly under pressure to deliver better performance. This may mean cutting corners to deliver more cores, bandwidth, or other resources as quickly as possible. However, we're often making an unseen trade-off: sacrificing considerable reliability in pursuit of marginal performance gains.

Stated simply, a "faster" process that fails more often may not actually be faster at all.

In this talk, I'll argue that reliability is one of the best investments you can make to improve system performance. We'll explore the way in which even small reductions in reliability can translate into significantly worse average performance and increased costs. We'll consider these implications through the lens of expected value.
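
As a rough illustration of the expected-value framing (not taken from the talk itself), consider a job that is retried until it succeeds. Assuming each attempt takes the same time whether it passes or fails, and that attempts fail independently, the expected number of attempts is 1/p and the expected time to a successful result is t/p. The function name and all numbers below are hypothetical, chosen only to show the shape of the trade-off.

```python
def expected_time_to_success(attempt_minutes: float, success_probability: float) -> float:
    """Expected wall-clock minutes until one attempt succeeds.

    Assumptions (illustrative only): every attempt costs the full
    attempt_minutes, failures are independent, and we retry until success,
    so the attempt count is geometric with mean 1 / success_probability.
    """
    if not 0.0 < success_probability <= 1.0:
        raise ValueError("success_probability must be in (0, 1]")
    return attempt_minutes / success_probability


# Hypothetical numbers: a 10-minute job that succeeds 99% of the time,
# versus a "faster" 8-minute job that succeeds only 75% of the time.
reliable = expected_time_to_success(10.0, 0.99)
fast_but_flaky = expected_time_to_success(8.0, 0.75)

print(f"reliable:       {reliable:.2f} min")        # 10.10 min
print(f"fast but flaky: {fast_but_flaky:.2f} min")  # 10.67 min
```

Under these assumptions the nominally faster job is slower on average, and every retry also consumes compute, which is one way reduced reliability can show up as increased cost.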

Brad Shively is an engineering manager at the Uber Advanced Technologies Group, where he leads a large part of the Developer Experience team. Prior to Uber, he worked in business operations at Google and spent time as a management consultant. He's passionate about building both engineering teams and services that are durable and high-performance.

