Om Rameshwar Gatla, Muhammad Hameed, and Mai Zheng, New Mexico State University; Viacheslav Dubeyko, Adam Manzanares, Filip Blagojevic, Cyril Guyot, and Robert Mateescu, Western Digital Research
File systems may become corrupted for many reasons despite various protection techniques. Therefore, most file systems come with a checker to recover the file system to a consistent state. However, existing checkers are commonly assumed to be able to complete the repair without interruption, which may not be true in practice.
In this work, we demonstrate via fault injection experiments that checkers of widely used file systems may leave the file system in an uncorrectable state if the repair procedure is interrupted unexpectedly. To address the problem, we first fix the ordering issue in the undo logging of e2fsck, and then build a general logging library (i.e., rfsck-lib) for strengthening checkers. To demonstrate its practicality, we integrate rfsck-lib with existing checkers and create two new checkers: (1) rfsck-ext, a robust checker for Ext-family file systems, and (2) rfsck-xfs, a robust checker for the XFS file system, both of which require only tens of lines of modification to the original versions. Both rfsck-ext and rfsck-xfs are resilient to faults in our experiments. Also, both checkers incur reasonable performance overhead (i.e., up to 12%) compared to the original unreliable versions. Moreover, rfsck-ext outperforms the patched e2fsck by up to nine times while achieving the same level of robustness.
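To illustrate why ordering matters in undo logging, the following is a minimal sketch, not the e2fsck or rfsck-lib implementation. The abstract does not spell out the ordering issue, so the sketch assumes the general rule for undo logs: the old contents of a block must be durable in the log before the block is overwritten in place. The names `write_block_with_undo`, `rollback`, and `BLOCK_SIZE`, as well as the record layout, are hypothetical.

```python
import os

BLOCK_SIZE = 4096  # assumed block size for this sketch


def write_block_with_undo(image_fd, undo_fd, block_no, new_data):
    """Overwrite one block, saving its old contents to an undo log first."""
    assert len(new_data) == BLOCK_SIZE
    offset = block_no * BLOCK_SIZE
    # Read the current contents; pad with zeros if the image is shorter.
    old_data = os.pread(image_fd, BLOCK_SIZE, offset).ljust(BLOCK_SIZE, b"\0")
    # 1. Append (block number, old contents) to the undo log.
    #    undo_fd is assumed to be opened with O_APPEND.
    os.write(undo_fd, block_no.to_bytes(8, "little") + old_data)
    # 2. Make the undo record durable before touching the image.
    os.fsync(undo_fd)
    # 3. Only now overwrite the block in place.
    os.pwrite(image_fd, new_data, offset)


def rollback(image_fd, undo_fd):
    """Restore every logged block, newest record first."""
    record_size = 8 + BLOCK_SIZE
    # Integer division ignores a torn (partially written) trailing record.
    n_records = os.fstat(undo_fd).st_size // record_size
    for i in reversed(range(n_records)):
        record = os.pread(undo_fd, record_size, i * record_size)
        block_no = int.from_bytes(record[:8], "little")
        os.pwrite(image_fd, record[8:], block_no * BLOCK_SIZE)
    os.fsync(image_fd)
```

If steps 2 and 3 were swapped, an interruption between them could leave a modified block with no durable record of its previous contents, so the repair could not be undone. Replaying records newest first means a block that was overwritten several times ends up with its earliest logged contents.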