Removing Double-Logging with Passive Data Persistence in LSM-tree based Relational Databases

Authors: 

Kecheng Huang, Shandong University, The Chinese University of Hong Kong; Zhaoyan Shen and Zhiping Jia, Shandong University; Zili Shao, The Chinese University of Hong Kong; Feng Chen, Louisiana State University

Abstract: 

Storage engine is a crucial component in relational databases (RDBs). With the emergence of Internet services and applications, a recent technical trend is to deploy a Log-structured Merge Tree (LSM-tree) based storage engine. Although such an approach can achieve high performance and efficient storage space usage, it also brings a critical double-logging problem——In LSM-tree based RDBs, both the upper RDB layer and the lower storage engine layer implement redundant logging facilities, which perform synchronous and costly I/Os for data persistence. Unfortunately, such “double protection” does not provide extra benefits but only incurs heavy and unnecessary performance overhead.

In this paper, we propose a novel solution, called Passive Data Persistence Scheme (PASV), to address the double-logging problem in LSM-tree based RDBs. By completely removing Write-ahead Log (WAL) in the storage engine layer, we develop a set of mechanisms, including a passive memory buffer flushing policy, an epoch-based data persistence scheme, and an optimized partial data recovery process, to achieve reliable and low-cost data persistence during normal runs and also fast and efficient recovery upon system failures. We implement a fully functional, open-sourced prototype of PASV based on Facebook’s MyRocks. Evaluation results show that our solution can effectively improve system performance by increasing throughput by up to 49.9% and reducing latency by up to 89.3%, and it also saves disk I/Os by up to 42.9% and reduces recovery time by up to 4.8%.

Open Access Media

USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.

This content is available to:

BibTeX
@inproceedings {277848,
author = {Kecheng Huang and Zhaoyan Shen and Zhiping Jia and Zili Shao and Feng Chen},
title = {Removing {Double-Logging} with Passive Data Persistence in {LSM-tree} based Relational Databases},
booktitle = {20th USENIX Conference on File and Storage Technologies (FAST 22)},
year = {2022},
isbn = {978-1-939133-26-7},
address = {Santa Clara, CA},
pages = {101--116},
url = {https://www.usenix.org/conference/fast22/presentation/huang},
publisher = {USENIX Association},
month = feb
}

Presentation Video