Zeta: A Scalable and Robust East-West Communication Framework in Large-Scale Clouds

Website Maintenance Alert

Due to scheduled maintenance, the USENIX website will not be available on Saturday, April 13, from 12:00 am–12:30 am Pacific Daylight Time (UTC-7). We apologize for the inconvenience.

If you are trying to register for NSDI '24 or register for PEPR '24, please complete your registration before or after this time period.

Authors: 

Qianyu Zhang, Gongming Zhao, and Hongli Xu, University of Science and Technology of China; Zhuolong Yu, Johns Hopkins University; Liguang Xie, Futurewei Technologies; Yangming Zhao, University of Science and Technology of China; Chunming Qiao, SUNY at Buffalo; Ying Xiong, Futurewei Technologies; Liusheng Huang, University of Science and Technology of China

Abstract: 

With the broad deployment of distributed applications on clouds, the dominant volume of traffic in cloud networks traverses in an east-west direction, flowing from server to server within a data center. Existing communication solutions are tightly coupled with either the control plane (e.g., preprogrammed model) or the location of compute nodes (e.g., conventional gateway model). The tight coupling makes it challenging to adapt to rapid network expansion, respond to network anomalies (e.g., burst traffic and device failures), and maintain low latency for east-west traffic.

To address this issue, we design Zeta, a scalable and robust east-west communication framework with gateway clusters in large-scale clouds. Zeta abstracts the traffic forwarding capability as a Gateway Cluster Layer, decoupled from the logic of control plane and the location of compute nodes. Specifically, Zeta adopts gateway clusters to support large-scale networks and cope with burst traffic. Moreover, a transparent Multi IPs Migration is proposed to quickly recover the system/devices from unpredictable failures. We implement Zeta based on eXpress Data Path (XDP) and evaluate its scalability and robustness through comprehensive experiments with up to 100k container instances. Our evaluation shows that Zeta reduces the 99% RTT by 5.1 × in burst video traffic, and speeds up the gateway recovery by 10.8 × compared with the state-of-the-art solutions.

NSDI '22 Open Access Sponsored by
King Abdullah University of Science and Technology (KAUST)

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

This content is available to:

BibTeX
@inproceedings {278310,
author = {Qianyu Zhang and Gongming Zhao and Hongli Xu and Zhuolong Yu and Liguang Xie and Yangming Zhao and Chunming Qiao and Ying Xiong and Liusheng Huang},
title = {Zeta: A Scalable and Robust {East-West} Communication Framework in {Large-Scale} Clouds},
booktitle = {19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22)},
year = {2022},
isbn = {978-1-939133-27-4},
address = {Renton, WA},
pages = {1231--1248},
url = {https://www.usenix.org/conference/nsdi22/presentation/zhang-qianyu},
publisher = {USENIX Association},
month = apr
}

Presentation Video