Siphon: Expediting Inter-Datacenter Coflows in Wide-Area Data Analytics

Authors: 

Shuhao Liu, Li Chen, and Baochun Li, University of Toronto

Abstract: 

It is increasingly common that large volumes of production data originate from geographically distributed datacenters. Processing such datasets with existing data parallel frameworks may suffer from significant slowdowns due to the much lower availability of inter-datacenter bandwidth. Thus, it is critical to optimize the delivery of inter-datacenter traffic, especially coflows that imply application-level semantics, to improve the performance of such geo-distributed applications.

In this paper, we present Siphon, a building block integrated in existing data parallel frameworks (e.g., Apache Spark) to expedite their generated inter-datacenter coflows at runtime. Specifically, Siphon serves as a transport service that accelerates and schedules the inter-datacenter traffic with the awareness of workload-level dependencies and performance, while being completely transparent to analytics applications. Novel intra-coflow and inter-coflow scheduling and routing strategies have been designed and implemented in Siphon, based on a software-defined networking architecture.

On our cloud-based testbeds, we have extensively evaluated Siphon's performance in accelerating coflows generated by a broad range of workloads. With a variety of Spark jobs, Siphon can reduce the completion time of a single coflow by up to 76%. With respect to the average coflow completion time, Siphon outperforms the state-of-the-art scheme by 10%.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {216021,
author = {Shuhao Liu and Li Chen and Baochun Li},
title = {Siphon: Expediting {Inter-Datacenter} Coflows in {Wide-Area} Data Analytics},
booktitle = {2018 USENIX Annual Technical Conference (USENIX ATC 18)},
year = {2018},
isbn = {978-1-939133-01-4},
address = {Boston, MA},
pages = {507--518},
url = {https://www.usenix.org/conference/atc18/presentation/liu},
publisher = {USENIX Association},
month = jul
}

Presentation Audio