Biblio

Filters: Author is Yinmin Zhong

2024

Zhong Y, Liu S, Chen J, Hu J, Zhu Y, Liu X, Jin X, Zhang H. 2024. DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving. 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24). pp. 193–210.

Jiang Z, Lin H, Zhong Y, Huang Q, Chen Y, Zhang Z, Peng Y, Li X, Xie C, Nong S et al. 2024. MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs. 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24). pp. 745–760.

2023

Li Z, Zheng L, Zhong Y, Liu V, Sheng Y, Jin X, Huang Y, Chen Z, Zhang H, Gonzalez JE et al. 2023. AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23). pp. 663–679.