MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs

Title: MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
Publication Type: Conference Paper
Year of Publication: 2024
Authors: Jiang Z, Lin H, Zhong Y, Huang Q, Chen Y, Zhang Z, Peng Y, Li X, Xie C, Nong S, Jia Y, He S, Chen H, Bai Z, Hou Q, Yan S, Zhou D, Sheng Y, Jiang Z, Xu H, Wei H, Zhang Z, Nie P, Zou L, Zhao S, Xiang L, Liu Z, Li Z, Jia X, Ye J, Jin X, Liu X
Conference Name: 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24)
Date Published: 04/2024
Publisher: USENIX Association
Conference Location: Santa Clara, CA
ISBN Number: 978-1-939133-39-7
URL: https://www.usenix.org/conference/nsdi24/presentation/jiang-ziheng