Biblio

Filters: Author is Yinmin Zhong
2024
Zhong Y, Liu S, Chen J, Hu J, Zhu Y, Liu X, Jin X, Zhang H. 2024. DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving. 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24). :193-210.
Jiang Z, Lin H, Zhong Y, Huang Q, Chen Y, Zhang Z, Peng Y, Li X, Xie C, Nong S et al. 2024. MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs. 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24). :745-760.
2023
Li Z, Zheng L, Zhong Y, Liu V, Sheng Y, Jin X, Huang Y, Chen Z, Zhang H, Gonzalez JE et al. 2023. AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23). :663-679.