DistVS: Large-scale Vector Search with Compute-Memory Disaggregation

Peiqi Yin, The Chinese University of Hong Kong; Xiao Yan, Wuhan University; Shiyuan Deng, Huawei Cloud; Hui Li, Yifan Zhu, and Xiangyu Zhi, The Chinese University of Hong Kong; Jingqi Mao, Ran Xu, and Wenliang Zhang, Huawei Cloud; James Cheng, The Chinese University of Hong Kong

Similarity-based vector search, also known as approximate nearest neighbor search (ANNS), underlies many important applications such as content search, recommender systems, and retrieval-augmented generation (RAG). However, vector search has a high storage demand due to large datasets and incurs costly IOs for its fine-grained access to the vectors and index. We observe that a compute-memory disaggregation architecture can tackle these challenges and design the DistVS system with a three-tier storage layout. In particular, the compute servers keep the small, low-precision compressed vectors, a more capacious memory server stores the larger high-precision compressed vectors along with the index, and the full-precision exact vectors are kept on SSDs. The idea is to progressively prune the vector accesses along the low-high-full precisions from the compute servers to the SSDs, aligning with the memory-network-disk storage hierarchy, in which each tier offers larger capacity but higher IO cost. To effectively utilize the three vector precisions, we design an algorithm called PRESS to conduct vector search. To improve performance, DistVS incorporates system optimizations including asynchronous execution, RDMA IO batching, and decoupled re-ranking. We compare DistVS with state-of-the-art disk-based and distributed vector search systems and show that DistVS consistently outperforms them and usually improves their query throughput by over 40%.
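
The low-high-full pruning idea can be illustrated with a small sketch. This is not the PRESS algorithm, whose details the abstract does not give; it is a brute-force illustration that assumes coordinate rounding as a stand-in for real vector compression and skips the index entirely. The names `quantize`, `top_ids`, and `progressive_search` and the parameters `n_low` and `n_high` are hypothetical.

```python
import heapq
import random


def quantize(vec, step):
    """Round each coordinate to a multiple of `step` (assumed stand-in for compression)."""
    return [round(x / step) * step for x in vec]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_ids(query, store, ids, n):
    """Return the n ids from `ids` whose vectors in `store` are closest to `query`."""
    return heapq.nsmallest(n, ids, key=lambda i: sq_dist(query, store[i]))


def progressive_search(query, low, high, full, k, n_low, n_high):
    """Shrink the candidate set tier by tier: low -> high -> full precision."""
    # Tier 1: cheap scan over low-precision vectors (compute-server memory).
    cand = top_ids(query, low, range(len(low)), n_low)
    # Tier 2: re-score survivors with high-precision vectors (memory server).
    cand = top_ids(query, high, cand, n_high)
    # Tier 3: exact re-ranking of the few remaining candidates (SSD).
    return top_ids(query, full, cand, k)


# Synthetic data: 500 random 16-dimensional vectors and two lossy copies.
random.seed(0)
full = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(500)]
low = [quantize(v, 0.5) for v in full]    # coarse, small footprint
high = [quantize(v, 0.05) for v in full]  # finer, larger footprint
query = full[42]
result = progressive_search(query, low, high, full, k=5, n_low=100, n_high=20)
```

In this toy setup only `n_low` vectors are read at high precision and only `n_high` at full precision, which mirrors the goal of sending fewer accesses to the tiers with higher IO cost. Because the query is itself a dataset vector, its own id should survive both pruning steps and rank first in `result`.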

NSDI '26 Open Access Sponsored by
King Abdullah University of Science and Technology (KAUST)

BibTeX
@inproceedings {316742,
author = {Peiqi Yin and Xiao Yan and Shiyuan Deng and Hui Li and Yifan Zhu and Xiangyu Zhi and Jingqi Mao and Ran Xu and Wenliang Zhang and James Cheng},
title = {{DistVS}: Large-scale Vector Search with {Compute-Memory} Disaggregation},
booktitle = {23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI 26)},
year = {2026},
isbn = {978-1-939133-54-0},
address = {Renton, WA},
pages = {449--467},
url = {https://www.usenix.org/conference/nsdi26/presentation/yin},
publisher = {USENIX Association},
month = may
}
