PilotFish: Harvesting Free Cycles of Cloud Gaming with Deep Learning Training

Authors: 

Wei Zhang and Binghao Chen, Shanghai Jiao Tong University; Zhenhua Han, Microsoft Research; Quan Chen, Shanghai Jiao Tong University; Peng Cheng, Fan Yang, Ran Shu, and Yuqing Yang, Microsoft Research; Minyi Guo, Shanghai Jiao Tong University

Abstract: 

Cloud gaming services have become important workloads in cloud datacenters. However, our investigation shows that a cloud gaming service cannot saturate modern cloud GPUs. One way to improve GPU utilization is to co-locate multiple workloads on one GPU, which is challenging for cloud gaming because of its highly fluctuating and unpredictable GPU usage pattern. In this paper, we present PilotFish, a high-performance system that harvests the free GPU cycles of cloud gaming with deep learning (DL) training, while incurring almost zero interference to cloud gaming. We co-locate DL training jobs with cloud gaming because DL training jobs have stable and predictable workloads and no strict latency requirements. In more detail, PilotFish captures the idle periods of the game's GPU usage at sub-millisecond granularity through low-overhead instrumentation of graphics libraries. To avoid potential interference with cloud gaming, PilotFish schedules training computation kernels only when they can finish before the idle GPU period ends, and preempts straggler kernels that run longer than expected. Our evaluation on popular cloud games and DL models shows that PilotFish can harvest up to 85.1% of the idle GPU time from cloud gaming with no interference.
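
The scheduling rule described in the abstract (admit a training kernel only if it can finish within the idle period, and preempt kernels that overrun) can be illustrated with a small sketch. This is not PilotFish's implementation: the `Kernel` type, the `fill_idle_window` and `is_straggler` helpers, the predicted durations, and the 1.2x tolerance are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Kernel:
    name: str
    predicted_ms: float  # assumed: a profiled estimate of the kernel's run time


def fill_idle_window(idle_ms: float, queue: List[Kernel]) -> Tuple[List[Kernel], float]:
    """Admit queued kernels, in order, while each still fits in the idle time left."""
    admitted: List[Kernel] = []
    remaining = idle_ms
    for kernel in queue:
        if kernel.predicted_ms > remaining:
            break  # stop at the first kernel that would outlast the idle period
        admitted.append(kernel)
        remaining -= kernel.predicted_ms
    return admitted, remaining


def is_straggler(elapsed_ms: float, predicted_ms: float, tolerance: float = 1.2) -> bool:
    """Flag a running kernel for preemption once it exceeds its estimate by the tolerance."""
    return elapsed_ms > predicted_ms * tolerance


# Example: a 1.0 ms idle gap and three queued training kernels.
queue = [Kernel("conv", 0.4), Kernel("gemm", 0.5), Kernel("relu", 0.3)]
admitted, remaining = fill_idle_window(1.0, queue)
print([k.name for k in admitted])
```

In this sketch the queue is consumed strictly in order, on the assumption that training kernels must run in sequence; a scheduler that could reorder independent kernels might fill the remaining gap more tightly.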

BibTeX
@inproceedings {280782,
author = {Wei Zhang and Binghao Chen and Zhenhua Han and Quan Chen and Peng Cheng and Fan Yang and Ran Shu and Yuqing Yang and Minyi Guo},
title = {{PilotFish}: Harvesting Free Cycles of Cloud Gaming with Deep Learning Training},
booktitle = {2022 USENIX Annual Technical Conference (USENIX ATC 22)},
year = {2022},
isbn = {978-1-939133-29-60},
address = {Carlsbad, CA},
pages = {217--232},
url = {https://www.usenix.org/conference/atc22/presentation/zhang-wei},
publisher = {USENIX Association},
month = jul,
}
