How to Speed Up an Old Service

Due to the evolving Coronavirus/COVID-19 situation, SREcon20 Americas West has been rescheduled to June 2–4, 2020.

Thursday, March 26, 2020 - 10:20 am–11:00 am

Danilo Carvalho, Google


How does one go about speeding up a system that has been around for several years and is not well understood? In this talk I'll go over some of the reasons why improving the performance of such systems might be necessary, along with high-level lessons learned while dealing with one such system: a strongly consistent distributed storage file system.

On the need to improve performance, I outline three reasons: cost, capacity, and complexity. While the first two are straightforward, people are often surprised that improved performance frequently decreases the complexity of services by allowing the removal of caching, co-location restrictions, and request hedging. I'll illustrate this point with a case where improving latency on the core service allowed us to remove an entire caching layer, deleting thousands of lines of code.

@conference {247255,
author = {Danilo Carvalho},
title = {How to Speed Up an Old Service},
year = {2020},
address = {Santa Clara, CA},
publisher = {{USENIX} Association},
month = mar,