Jeeyun Kim, Pohang University of Science and Technology (POSTECH); Seonggyun Oh and Jungwoo Kim, Daegu Gyeongbuk Institute of Science and Technology; Jisung Park, Pohang University of Science and Technology (POSTECH); Jaeho Kim, Gyeongsang National University; Sungjin Lee, Pohang University of Science and Technology (POSTECH); Sam H. Noh, Virginia Tech
Log-structured systems have become the backbone of modern data-intensive applications thanks to their high write throughput. Their efficiency, however, is degraded by the write amplification that garbage collection induces, measured as the write amplification factor (WAF). Despite extensive studies, a wide gap remains between practice and optimality. In this paper, we bridge this gap with two key contributions. We first design NoDaP, a near-optimal oracle baseline that sets the upper bound for WAF reduction. Then, guided by insights from NoDaP, we propose DOGI, an oracle-inspired data placement technique that combines simple yet effective heuristics with lightweight machine learning. DOGI predicts invalidation times for data blocks with high accuracy, dynamically tunes group configurations, and finds the sweet spot between fine-grained data placement and misprediction penalty. Our experiments, using simulations and a prototype on a zoned device, show that DOGI reduces WAF by up to 23.2% while improving write throughput by up to 13.3% over the best-performing baseline.

author = {Jeeyun Kim and Seonggyun Oh and Jungwoo Kim and Jisung Park and Jaeho Kim and Sungjin Lee and Sam H. Noh},
title = {{DOGI}: Data Placement with {Oracle-Guided} Insights for {Log-Structured} Systems},
booktitle = {24th USENIX Conference on File and Storage Technologies (FAST 26)},
year = {2026},
isbn = {978-1-939133-53-3},
address = {Santa Clara, CA},
pages = {543--559},
url = {https://www.usenix.org/conference/fast26/presentation/kim-jeeyun},
publisher = {USENIX Association},
month = feb
}


