AntMan: Dynamic Scaling on GPU Clusters for Deep Learning

Title: AntMan: Dynamic Scaling on GPU Clusters for Deep Learning
Publication Type: Conference Paper
Year of Publication: 2020
Authors: Xiao W, Ren S, Li Y, Zhang Y, Hou P, Li Z, Feng Y, Lin W, Jia Y
Conference Name: 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20)
Date Published: 11/2020
Publisher: USENIX Association
ISBN Number: 978-1-939133-19-9
URL: https://www.usenix.org/conference/osdi20/presentation/xiao