Scaling Large Production Clusters with Partitioned Synchronization

Authors: 

Yihui Feng, Alibaba Group; Zhi Liu, Yunjian Zhao, Tatiana Jin, and Yidi Wu, The Chinese University of Hong Kong; Yang Zhang, Alibaba Group; James Cheng, The Chinese University of Hong Kong; Chao Li and Tao Guan, Alibaba Group

Awarded Best Paper!

Abstract: 

The scale of computer clusters has grown significantly in recent years. Today, a cluster may have 100 thousand machines and execute billions of tasks, especially short tasks, each day. As a result, the scheduler, which manages resource utilization in a cluster, also needs to be upgraded to work at a much larger scale. However, upgrading the scheduler, a central system component, in a large production cluster is a daunting task because we need to ensure the cluster's stability and robustness: user transparency should be guaranteed, and other cluster components and the existing scheduling policies need to remain unchanged. We investigated existing scheduler designs and found that most cannot handle the scale of our production clusters or may endanger their robustness. We analyzed the most suitable design, which follows a shared-state architecture, and its limitations led us to a fine-grained, staleness-aware state-sharing design called partitioned synchronization (ParSync). ParSync retains the simplicity required for maintaining the robustness of a production cluster, while achieving high scheduling efficiency and quality at scale. ParSync has been deployed and is running stably in our production clusters.

BibTeX
@inproceedings {273851,
author = {Yihui Feng and Zhi Liu and Yunjian Zhao and Tatiana Jin and Yidi Wu and Yang Zhang and James Cheng and Chao Li and Tao Guan},
title = {Scaling Large Production Clusters with Partitioned Synchronization},
booktitle = {2021 USENIX Annual Technical Conference (USENIX ATC 21)},
year = {2021},
isbn = {978-1-939133-23-6},
pages = {81--97},
url = {https://www.usenix.org/conference/atc21/presentation/feng-yihui},
publisher = {USENIX Association},
month = jul
}