Biblio

Filters: Author is Hao Zhang
2024
Zhong Y, Liu S, Chen J, Hu J, Zhu Y, Liu X, Jin X, Zhang H.  2024.  DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving. 18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24). :193--210.
2023
Li Z, Zheng L, Zhong Y, Liu V, Sheng Y, Jin X, Huang Y, Chen Z, Zhang H, Gonzalez JE et al.  2023.  AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. 17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23). :663--679.
2022
Zheng L, Li Z, Zhang H, Zhuang Y, Chen Z, Huang Y, Wang Y, Xu Y, Zhuo D, Xing EP et al.  2022.  Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning. 16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22). :559--578.
2021
Qiao A, Choe SKeun, Subramanya SJayaram, Neiswanger W, Ho Q, Zhang H, Ganger GR, Xing EP.  2021.  Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning. 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). :1--18.
2018
Xu S, Zhang H, Neubig G, Dai W, Kim JKyu, Deng Z, Ho Q, Yang G, Xing EP.  2018.  Cavs: An Efficient Runtime System for Dynamic Neural Networks. 2018 USENIX Annual Technical Conference (USENIX ATC 18). :937--950.