REMIX: Efficient Range Query for LSM-trees

Authors: 

Wenshao Zhong, Chen Chen, and Xingbo Wu, University of Illinois at Chicago; Song Jiang, University of Texas at Arlington

Abstract: 

LSM-tree based key-value (KV) stores organize data in a multi-level structure for high-speed writes. Range queries on traditional LSM-trees must seek and sort-merge data from multiple table files on the fly, which is expensive and often leads to mediocre read performance. To improve range query efficiency on LSM-trees, we introduce a space-efficient KV index data structure, named REMIX, that records a globally sorted view of KV data spanning multiple table files. A range query on multiple REMIX-indexed data files can quickly locate the target key using a binary search, and retrieve subsequent keys in sorted order without key comparisons. We build RemixDB, an LSM-tree based KV-store that adopts a write-efficient compaction strategy and employs REMIXes for fast point and range queries. Experimental results show that REMIXes can substantially improve range query performance in a write-optimized LSM-tree based KV-store.

FAST '21 Open Access Sponsored by NetApp

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX
@inproceedings {264856,
author = {Wenshao Zhong and Chen Chen and Xingbo Wu and Song Jiang},
title = {{REMIX}: Efficient Range Query for {LSM-trees}},
booktitle = {19th USENIX Conference on File and Storage Technologies (FAST 21)},
year = {2021},
isbn = {978-1-939133-20-5},
pages = {51--64},
url = {https://www.usenix.org/conference/fast21/presentation/zhong},
publisher = {USENIX Association},
month = feb
}

Presentation Video