Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows

Authors: 

Vasiliki Kalavri, John Liagouris, Moritz Hoffmann, and Desislava Dimitrova, ETH Zurich; Matthew Forshaw, Newcastle University; Timothy Roscoe, ETH Zurich

Abstract: 

Streaming computations are by nature long-running, and their workloads can change in unpredictable ways. This in turn means that maintaining performance may require dynamically scaling allocated computational resources. Some modern large-scale stream processors allow dynamic scaling but typically leave the difficult task of deciding how much to scale to the user. The process is cumbersome, slow and often inefficient. Where automatic scaling is supported, policies rely on coarse-grained metrics like observed throughput, backpressure, and CPU utilization. As a result, they tend to show incorrect provisioning, oscillations, and long convergence times. We present DS2, an automatic scaling controller for such systems which combines a general performance model of streaming dataflows with lightweight instrumentation to estimate the true processing and output rates of individual dataflow operators. We apply DS2 on Apache Flink and Timely Dataflow and demonstrate its accuracy and fast convergence. When compared to Dhalion, the state-of-the-art technique in Heron, DS2 converges to the optimal, backpressure-free configuration in a single step instead of six.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

Presentation Audio

BibTeX
@inproceedings {222593,
author = {Vasiliki Kalavri and John Liagouris and Moritz Hoffmann and Desislava Dimitrova and Matthew Forshaw and Timothy Roscoe},
title = {Three steps is all you need: fast, accurate, automatic scaling decisions for distributed streaming dataflows},
booktitle = {13th {USENIX} Symposium on Operating Systems Design and Implementation ({OSDI} 18)},
year = {2018},
isbn = {978-1-931971-47-8},
address = {Carlsbad, CA},
pages = {783--798},
url = {https://www.usenix.org/conference/osdi18/presentation/kalavri},
publisher = {{USENIX} Association},
}