Yiran Lei, Carnegie Mellon University and MangoBoost; Dongjoo Lee, MangoBoost; Liangyu Zhao, University of Washington; Daniar Kurniawan, Chanmyeong Kim, Heetaek Jeong, Changsu Kim, and Hyeonseong Choi, MangoBoost; Liangcheng Yu, University of Pennsylvania; Arvind Krishnamurthy, University of Washington; Justine Sherry, Carnegie Mellon University; Eriko Nurvitadhi, MangoBoost
All-to-All(v) communication is a critical primitive in modern machine learning workloads, particularly mixture-of-experts (MoE) models. Unfortunately, efficient scheduling is challenging due to workload skew, heterogeneous two-tier fabrics, and incast congestion, compounded by the dynamic nature of MoE workloads, where traffic shifts every few hundred milliseconds. Existing schedulers scale poorly, incurring seconds to hours of synthesis time, which makes them impractical.
We present FAST, an efficient All-to-All(v) scheduler. FAST addresses skew through intra-server rebalancing and enforces balanced, one-to-one scale-out transfers that avoid incast. Evaluated extensively on both NVIDIA H200 and AMD MI300X clusters, FAST consistently outperforms state-of-the-art solutions on skewed workloads while reducing synthesis time by orders of magnitude.
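The idea of one-to-one scale-out transfers that avoid incast can be illustrated with a standard shifted-permutation schedule. The sketch below is not FAST's algorithm; it is a minimal illustration under simplifying assumptions: it ignores skew, message sizes, and the intra-server tier, and the names `one_to_one_rounds` and `max_fan_in` are hypothetical.

```python
def one_to_one_rounds(num_servers):
    """Build rounds of (src, dst) pairs using a shifted permutation.

    In the round with a given shift, server s sends to (s + shift) % n,
    so every server sends to exactly one peer and receives from exactly one.
    """
    rounds = []
    for shift in range(1, num_servers):
        pairs = [(src, (src + shift) % num_servers) for src in range(num_servers)]
        rounds.append(pairs)
    return rounds


def max_fan_in(round_pairs):
    """Return the largest number of senders targeting one receiver in a round."""
    counts = {}
    for _, dst in round_pairs:
        counts[dst] = counts.get(dst, 0) + 1
    return max(counts.values())


# Illustrative run with 4 servers (an assumed cluster size).
rounds = one_to_one_rounds(4)
for index, pairs in enumerate(rounds):
    print(f"round {index}: {pairs} (max fan-in {max_fan_in(pairs)})")
```

Each round has a maximum fan-in of 1, so no receiver is the target of several simultaneous senders, and after `num_servers - 1` rounds every ordered pair of distinct servers has exchanged data exactly once.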
NSDI '26 Open Access Sponsored by King Abdullah University of Science and Technology (KAUST)

author = {Yiran Lei and Dongjoo Lee and Liangyu Zhao and Daniar Kurniawan and Chanmyeong Kim and Heetaek Jeong and Changsu Kim and Hyeonseong Choi and Liangcheng Yu and Arvind Krishnamurthy and Justine Sherry and Eriko Nurvitadhi},
title = {{FAST}: An Efficient Scheduler for {All-to-All} {GPU} Communication},
booktitle = {23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI 26)},
year = {2026},
isbn = {978-1-939133-54-0},
address = {Renton, WA},
pages = {2515--2531},
url = {https://www.usenix.org/conference/nsdi26/presentation/lei-yiran},
publisher = {USENIX Association},
month = may
}
