{MatrixKV}: Reducing Write Stalls and Write Amplification in {LSM-tree} Based {KV} Stores with Matrix Container in {NVM}

Ting Yao; Yiwen Zhang; Jiguang Wan; Qiu Cui; Liu Tang; Hong Jiang; Changsheng Xie; Xubin He

Authors:

Ting Yao, Yiwen Zhang, and Jiguang Wan, Huazhong University of Science and Technology; Qiu Cui and Liu Tang, PingCAP; Hong Jiang, UT Arlington; Changsheng Xie, Huazhong University of Science and Technology; Xubin He, Temple University

Abstract:

Popular LSM-tree based key-value stores suffer from suboptimal and unpredictable performance due to write amplification and write stalls that cause application performance to periodically drop to nearly zero. Our preliminary experimental studies reveal that (1) write stalls mainly stem from the significantly large amount of data involved in each compaction between L₀–L₁ (i.e., the first two levels of LSM-tree), and (2) write amplification increases with the depth of LSM-trees. Existing works mainly focus on reducing write amplification, while only a couple of them target mitigating write stalls.

In this paper, we exploit non-volatile memory (NVM) to address these two limitations and propose MatrixKV, a new LSM-tree based KV store for systems with multi-tier DRAM-NVM-SSD storage. MatrixKV's design principles include performing smaller and cheaper L₀–L₁ compaction to reduce write stalls while reducing the depth of LSM-trees to mitigate write amplification. To this end, four novel techniques are proposed. First, we relocate and manage the L₀ level in NVM with our proposed matrix container. Second, the new column compaction is devised to compact L₀ to L₁ at fine-grained key ranges, thus substantially reducing the amount of compaction data. Third, MatrixKV increases the width of each level to decrease the depth of LSM-trees thus mitigating write amplification. Finally, the cross-row hint search is introduced for the matrix container to keep adequate read performance. We implement MatrixKV based on RocksDB and evaluate it on a hybrid DRAM/NVM/SSD system using Intel's latest 3D Xpoint NVM device Optane DC PMM. Evaluation results show that, with the same amount of NVM, MatrixKV achieves 5× and 1.9× lower 99^th percentile latencies, and 3.6× and 2.6× higher random write throughput than RocksDB and the state-of-art LSM-based KVS NoveLSM respectively.

Ting Yao, Huazhong University of Science and Technology

Yiwen Zhang, Huazhong University of Science and Technology

Jiguang Wan, Huazhong University of Science and Technology

Qiu Cui, PingCAP

Liu Tang, PingCAP

Hong Jiang, UT Arlington

Changsheng Xie, Huazhong University of Science and Technology

Xubin He, Temple University

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

BibTeX

@inproceedings {254461,
author = {Ting Yao and Yiwen Zhang and Jiguang Wan and Qiu Cui and Liu Tang and Hong Jiang and Changsheng Xie and Xubin He},
title = {{MatrixKV}: Reducing Write Stalls and Write Amplification in {LSM-tree} Based {KV} Stores with Matrix Container in {NVM}},
booktitle = {2020 USENIX Annual Technical Conference (USENIX ATC 20)},
year = {2020},
isbn = {978-1-939133-14-4},
pages = {17--31},
url = {https://www.usenix.org/conference/atc20/presentation/yao},
publisher = {USENIX Association},
month = jul
}

Download

Yao PDF

View the slides

MatrixKV: Reducing Write Stalls and Write Amplification in LSM-tree Based KV Stores with Matrix Container in NVM