How to Trade off Server Utilization and Tail Latency

Friday, June 14, 2019 - 10:00 am–10:30 am

Julius Plenz, Google


When running large-scale systems, we strive to deliver both low tail latency and high server utilization. However, these two dimensions are at odds: increasing the average utilization of a system has a detrimental impact on its tail latency.

This talk provides a lightweight walkthrough of the important basics of queueing theory (avoiding unnecessary formalism), graphically illustrates several typical outcomes of this analysis, and closes with a few basic rules on how to think about utilization and tail latency.
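The tension between utilization and tail latency can be made concrete with the simplest textbook queueing model. The sketch below is an illustration only, not taken from the talk: it assumes an M/M/1 queue (Poisson arrivals, exponential service times, a single first-come-first-served server), in which response time is exponentially distributed with rate mu * (1 - rho). The function name `mm1_latency_percentile` and its parameters are hypothetical.

```python
import math


def mm1_latency_percentile(utilization, service_rate=1.0, percentile=0.99):
    """Response-time percentile of an M/M/1 FCFS queue (illustrative model only).

    utilization  -- rho = arrival rate / service rate, must be in [0, 1)
    service_rate -- mu, requests served per unit time (assumed value)
    percentile   -- e.g. 0.99 for the p99 response time
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    if not 0.0 < percentile < 1.0:
        raise ValueError("percentile must be in (0, 1)")
    # In M/M/1, time in system (waiting + service) is exponential
    # with rate mu * (1 - rho), so the percentile has a closed form.
    return -math.log(1.0 - percentile) / (service_rate * (1.0 - utilization))


if __name__ == "__main__":
    # With service_rate=1.0, results are in multiples of the mean service time.
    for rho in (0.5, 0.7, 0.9, 0.95):
        p99 = mm1_latency_percentile(rho)
        print(f"utilization {rho:.2f}: p99 = {p99:.1f} x mean service time")
```

Under this model the p99 response time scales with 1 / (1 - rho), so raising utilization from 50% to 90% multiplies it by five. Real systems with multiple servers or different service-time distributions will behave differently in detail.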

Julius studied Math in Berlin and has been with Google in Sydney for four years, where he’s worked mostly on low-latency distributed storage systems.

@conference {233287,
author = {Julius Plenz},
title = {How to Trade off Server Utilization and Tail Latency},
year = {2019},
address = {Singapore},
publisher = {USENIX Association},
month = jun
}