Jingwei Xu, Shanghai Jiao Tong University and Huawei Technologies; Junbin Kang, Huawei Technologies; Mingkai Dong, Shanghai Jiao Tong University; Mingyu Liu, Lu Zhang, Shaohong Guo, and Ziyan Qiu, Huawei Technologies; Mingzhen You and Ziyi Tian, Shanghai Jiao Tong University; Anqi Yu, Tianhong Ding, and Xinwei Hu, Huawei Technologies; Haibo Chen, Shanghai Jiao Tong University and Huawei Technologies
Client-side metadata caching has long been considered an effective method for accelerating metadata operations in distributed file systems (DFSs). However, we have found that in deep learning pipelines, client-side state (e.g., caching) is not only ineffective but also consumes valuable memory resources. We thus propose FalconFS, a DFS optimized for deep learning pipelines with a stateless-client architecture. Specifically, instead of performing client-side path resolution and caching, FalconFS efficiently resolves paths on the server side using hybrid metadata indexing and lazy namespace replication. FalconFS also boosts server concurrency with concurrent request merging and provides easy deployment with a VFS shortcut. Evaluations against CephFS and Lustre show that FalconFS achieves up to 5.72× throughput for small file read/write and up to 12.81× throughput for deep learning model training. FalconFS has been running for one year in the production environment of Huawei's autonomous driving system with 10,000 NPUs, and it has been open-sourced.
author = {Jingwei Xu and Junbin Kang and Mingkai Dong and Mingyu Liu and Lu Zhang and Shaohong Guo and Ziyan Qiu and Mingzhen You and Ziyi Tian and Anqi Yu and Tianhong Ding and Xinwei Hu and Haibo Chen},
title = {{FalconFS}: Distributed File System for {Large-Scale} Deep Learning Pipeline},
booktitle = {23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI 26)},
year = {2026},
isbn = {978-1-939133-54-0},
address = {Renton, WA},
pages = {431--447},
url = {https://www.usenix.org/conference/nsdi26/presentation/xu},
publisher = {USENIX Association},
month = may
}


