When Cloud Storage Meets RDMA

Authors: 

Yixiao Gao, Nanjing University and Alibaba Group; Qiang Li, Lingbo Tang, Yongqing Xi, Pengcheng Zhang, Wenwen Peng, Bo Li, Yaohui Wu, Shaozong Liu, Lei Yan, Fei Feng, Yan Zhuang, Fan Liu, Pan Liu, Xingkui Liu, Zhongjie Wu, Junping Wu, and Zheng Cao, Alibaba Group; Chen Tian, Nanjing University; Jinbo Wu, Jiaji Zhu, Haiyong Wang, Dennis Cai, and Jiesheng Wu, Alibaba Group

Abstract: 

Pangu is a cloud storage system developed by Alibaba. Since its inception in 2009, it has served, and still serves, most of Alibaba's core businesses, e.g., e-business and online payment. A cloud storage system is expected to achieve high performance, high reliability, and high stability simultaneously. Recent rapid progress in storage media has made networking a major performance bottleneck for new generations of cloud storage. Remote Direct Memory Access (RDMA) running on lossless Ethernet is the most promising answer to the network bottleneck in cloud storage. In this paper, we share our experience of introducing RDMA into Pangu's storage networks. We design a fabric that takes performance, reliability, and stability into consideration together. For performance optimization, Pangu builds a software framework that integrates RDMA with its private storage protocol stack. For reliability guarantee, Pangu uses RDMA/TCP switching as a last resort. For stability improvement, Pangu uses intensive monitoring and parameter tuning for fail-over. As of submission time, RDMA-enabled Pangu has successfully served many online mission-critical services for over three years, including several important shopping festivals.

NSDI '21 Open Access Sponsored by NetApp

BibTeX
@inproceedings {262036,
author = {Yixiao Gao and Qiang Li and Lingbo Tang and Yongqing Xi and Pengcheng Zhang and Wenwen Peng and Bo Li and Yaohui Wu and Shaozong Liu and Lei Yan and Fei Feng and Yan Zhuang and Fan Liu and Pan Liu and Xingkui Liu and Zhongjie Wu and Junping Wu and Zheng Cao and Chen Tian and Jinbo Wu and Jiaji Zhu and Haiyong Wang and Dennis Cai and Jiesheng Wu},
title = {When Cloud Storage Meets {RDMA}},
booktitle = {18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21)},
year = {2021},
isbn = {978-1-939133-21-2},
pages = {519--533},
url = {https://www.usenix.org/conference/nsdi21/presentation/gao},
publisher = {USENIX Association},
month = apr
}
