Apache Nemo: A Framework for Building Distributed Dataflow Optimization Policies

Authors: 

Youngseok Yang and Jeongyoon Eo, Seoul National University; Geon-Woo Kim, Viva Republica; Joo Yeon Kim, Samsung Electronics; Sanha Lee, Naver Corp.; Jangho Seo, Won Wook Song, and Byung-Gon Chun, Seoul National University

Abstract: 

Optimizing scheduling and communication of distributed data processing for resource and data characteristics is crucial for achieving high performance. Existing approaches to such optimizations largely fall into two categories. First, distributed runtimes provide low-level policy interfaces to apply the optimizations, but do not ensure the maintenance of correct application semantics and thus often require significant effort to use. Second, policy interfaces that extend a high-level application programming model ensure correctness, but do not provide sufficient fine control. We describe Apache Nemo, an optimization framework for distributed dataflow processing that provides fine control for high performance, and also ensures correctness for ease of use. We combine several techniques to achieve this, including an intermediate representation, optimization passes, and runtime extensions. Our evaluation results show that Nemo enables composable and reusable optimizations that bring performance improvements on par with existing specialized runtimes tailored for a specific deployment scenario.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {234996,
author = {Youngseok Yang and Jeongyoon Eo and Geon-Woo Kim and Joo Yeon Kim and Sanha Lee and Jangho Seo and Won Wook Song and Byung-Gon Chun},
title = {Apache Nemo: A Framework for Building Distributed Dataflow Optimization Policies},
booktitle = {2019 USENIX Annual Technical Conference (USENIX ATC 19)},
year = {2019},
isbn = {978-1-939133-03-8},
address = {Renton, WA},
pages = {177--190},
url = {https://www.usenix.org/conference/atc19/presentation/yang-youngseok},
publisher = {USENIX Association},
month = jul
}

Presentation Video