Re-architecting Congestion Management in Lossless Ethernet

Authors: 

Wenxue Cheng and Kun Qian, Tsinghua University and Beijing National Research Center for Information Science and Technology (BNRist); Wanchun Jiang, Central South University; Tong Zhang, Tsinghua University, Beijing National Research Center for Information Science and Technology (BNRist), and Nanjing University of Aeronautics and Astronautics; Fengyuan Ren, Tsinghua University and Beijing National Research Center for Information Science and Technology (BNRist)

Abstract: 

Lossless Ethernet is attractive for data centers and cluster systems, but performance issues such as unfairness, head-of-line blocking, and congestion spreading impede its large-scale deployment in production systems. Through fine-grained experimental observations, we inspect the interactions between flow control and congestion control, and find that the root cause of these performance problems lies in ineffective elements of the congestion management architecture for lossless Ethernet, including an improper congestion detection mechanism and an inadequate rate adjustment law.

Inspired by the insights and findings from these experimental investigations, we revise the congestion management architecture and propose the Photonic Congestion Notification (PCN) scheme, which consists of two basic components: (i) a novel congestion detection and identification mechanism to recognize which flows are really responsible for congestion; (ii) a receiver-driven rate adjustment method to alleviate congestion in as little as 1 RTT. We implement PCN using DPDK NICs and evaluate it through testbed experiments and simulations. The results show that PCN greatly improves performance under concurrent burst workloads, significantly reduces PFC PAUSE messages, and reduces flow completion time under realistic workloads.

NSDI '20 Open Access Sponsored by NetApp

BibTeX
@inproceedings {246492,
author = {Wenxue Cheng and Kun Qian and Wanchun Jiang and Tong Zhang and Fengyuan Ren},
title = {Re-architecting Congestion Management in Lossless Ethernet},
booktitle = {17th USENIX Symposium on Networked Systems Design and Implementation (NSDI 20)},
year = {2020},
isbn = {978-1-939133-13-7},
address = {Santa Clara, CA},
pages = {19--36},
url = {https://www.usenix.org/conference/nsdi20/presentation/cheng},
publisher = {USENIX Association},
month = feb
}
