Light-Dedup: A Light-weight Inline Deduplication Framework for Non-Volatile Memory File Systems

Authors: 

Jiansheng Qiu, Yanqi Pan, Wen Xia, Xiaojia Huang, Wenjun Wu, Xiangyu Zou, and Shiyi Li, Harbin Institute of Technology, Shenzhen; Yu Hua, Huazhong University of Science and Technology

Abstract: 

Emerging non-volatile memory (NVM) is a promising candidate for next-generation storage media, but its high cost hinders adoption. Recent research on deduplication in NVM file systems shows that eliminating redundant data blocks can reduce this cost, but existing designs do not fully account for NVM's I/O mechanisms.

We propose Light-Dedup, a light-weight inline deduplication framework for NVM file systems that performs fast block-level deduplication while taking NVM's I/O mechanisms into consideration. Specifically, Light-Dedup introduces the Light-Redundant-Block-Identifier (LRBI), which combines a non-cryptographic hash with a speculative-prefetch-based byte-by-byte content-comparison approach. LRBI leverages the memory interface of NVM to enable asynchronous reads by speculatively prefetching in-NVM data blocks into the CPU/NVM buffers. As a result, the NVM read latency seen by content comparison is markedly reduced thanks to buffer hits. Moreover, Light-Dedup adopts an in-NVM Light-Meta-Table (LMT) that stores deduplication metadata and cooperates with LRBI. LMT is organized at region granularity, which significantly reduces metadata I/O amplification and improves deduplication performance.
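The identification step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `DedupStore`, its fields, the 4 KiB block size, and the use of CRC32 as the non-cryptographic hash are all assumptions made for illustration. The sketch shows only why a weak hash must be paired with a byte-by-byte comparison; speculative prefetching is a hardware-level optimization that Python cannot express, so it appears only as a comment.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class DedupStore:
    """Toy block store: a weak hash nominates candidates, a byte compare confirms."""

    def __init__(self, hash_fn=zlib.crc32):
        self.hash_fn = hash_fn  # stand-in for any non-cryptographic hash
        self.blocks = []        # stands in for data blocks kept in NVM
        self.refcount = []      # number of references to each stored block
        self.index = {}         # fingerprint -> list of candidate block numbers

    def write(self, data: bytes) -> int:
        """Store one block and return the block number that now holds it."""
        fp = self.hash_fn(data)
        for blk in self.index.get(fp, []):
            # A weak hash can collide, so equality of fingerprints is only a hint.
            # (A real system would try to have the candidate block already
            # prefetched into a buffer before this comparison runs.)
            if self.blocks[blk] == data:
                self.refcount[blk] += 1  # duplicate: share the existing block
                return blk
        # No candidate matched byte for byte: store a new block.
        blk = len(self.blocks)
        self.blocks.append(bytes(data))
        self.refcount.append(1)
        self.index.setdefault(fp, []).append(blk)
        return blk


if __name__ == "__main__":
    store = DedupStore()
    first = store.write(b"A" * BLOCK_SIZE)
    second = store.write(b"B" * BLOCK_SIZE)
    again = store.write(b"A" * BLOCK_SIZE)
    print(first == again, first != second, len(store.blocks))
```

Passing a deliberately bad hash, such as `DedupStore(hash_fn=lambda d: 0)`, forces every block into the same bucket. Distinct blocks are still stored separately because the byte comparison, not the fingerprint, makes the final decision.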

Experimental results suggest that Light-Dedup achieves 1.01–8.98× the I/O throughput of state-of-the-art NVM deduplication file systems. The speculative prefetch technique used in LRBI improves Light-Dedup by 0.3–118%. In addition, the region-based layout of LMT reduces metadata read/write amplification from 19.35×/9.86× to 6.10×/3.43× in our hand-crafted aging workload.

USENIX ATC '23 Open Access Sponsored by
King Abdullah University of Science and Technology (KAUST)

BibTeX
@inproceedings {288695,
author = {Jiansheng Qiu and Yanqi Pan and Wen Xia and Xiaojia Huang and Wenjun Wu and Xiangyu Zou and Shiyi Li and Yu Hua},
title = {{Light-Dedup}: A Light-weight Inline Deduplication Framework for {Non-Volatile} Memory File Systems},
booktitle = {2023 USENIX Annual Technical Conference (USENIX ATC 23)},
year = {2023},
isbn = {978-1-939133-35-9},
address = {Boston, MA},
pages = {101--116},
url = {https://www.usenix.org/conference/atc23/presentation/qiu-jiansheng},
publisher = {USENIX Association},
month = jul
}
