TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs

Title: TopoOpt: Co-optimizing Network Topology and Parallelization Strategy for Distributed Training Jobs
Publication Type: Conference Paper
Year of Publication: 2023
Authors: Wang W, Khazraee M, Zhong Z, Ghobadi M, Jia Z, Mudigere D, Zhang Y, Kewitsch A
Conference Name: 20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)
Date Published: 04/2023
Publisher: USENIX Association
Conference Location: Boston, MA
ISBN Number: 978-1-939133-33-5
URL: https://www.usenix.org/conference/nsdi23/presentation/wang-weiyang