LLMFabric: Unifying Decentralized HPC Clusters for Heterogeneous LLM Serving

Xiaozhe Yao, ETH Zurich; Youhe Jiang, University of Cambridge; Ilia Badanin, EPFL; Qinghao Hu, MIT; Binhang Yuan, HKUST; Imanol Schlag, ETH Zurich; Eiko Yoneki, University of Cambridge; Ana Klimovic, ETH Zurich

Category: Operational Systems Paper