Fractal: Fault-Tolerant Shell-Script Distribution

Zhicheng Huang, Ramiz Dundar, and Yizheng Xie, Brown University; Konstantinos Kallas, University of California, Los Angeles; Nikos Vasilakis, Brown University

This paper presents FRACTAL, a new system that offers fault-tolerant distributed shell-script execution for unmodified scripts. FRACTAL first distinguishes recoverable regions from side-effectful ones and augments the recoverable regions with additional runtime support aimed at fault recovery. It employs precise dependency and progress tracking at the subgraph level to offer sound and efficient fault recovery. It minimizes the number of upstream regions that are re-executed during recovery and ensures exactly-once semantics upon recovery for downstream regions. Evaluation on 4- and 30-node clusters indicates average fault-free speedups of (1) >9.6x over Bash, a single-node shell-interpreter baseline, (2) >5.5x over Hadoop Streaming, a MapReduce system that supports language-agnostic third-party components, and (3) 17% over DiSh, a state-of-the-art fault-intolerant shell-script distribution system, all while recovering 7.8–16.4x faster than Hadoop Streaming in cases of faults.
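
The idea of re-executing as few upstream regions as possible can be illustrated with a generic lineage-style sketch. This is not FRACTAL's algorithm, which the abstract does not spell out; the function name `regions_to_rerun`, the `upstream`/`failed`/`available` inputs, and the four-stage pipeline are all hypothetical and chosen only for illustration. The sketch assumes that a region whose complete output survived a fault can be re-read instead of re-run.

```python
from collections import deque


def regions_to_rerun(upstream, failed, available):
    """Return the set of regions that must re-execute after a fault.

    upstream:  dict mapping a region to the regions it reads from
    failed:    regions whose execution was lost to the fault
    available: regions whose complete output survived the fault
    (All three are hypothetical inputs for this illustration.)
    """
    rerun = set()
    queue = deque(failed)
    while queue:
        region = queue.popleft()
        if region in rerun:
            continue
        rerun.add(region)
        for parent in upstream.get(region, []):
            # Walk further upstream only when the parent's output
            # cannot simply be re-read.
            if parent not in available:
                queue.append(parent)
    return rerun


# Hypothetical pipeline: cat -> grep -> sort -> uniq
upstream = {"grep": ["cat"], "sort": ["grep"], "uniq": ["sort"]}

# "sort" fails and the output of "grep" was lost along with it.
lost_grep = regions_to_rerun(upstream, {"sort"}, {"cat"})

# "sort" fails but the output of "grep" is still readable.
kept_grep = regions_to_rerun(upstream, {"sort"}, {"cat", "grep"})
```

In the first call the walk stops at `cat`, whose output is still available, so only `sort` and `grep` are re-run; in the second call only `sort` is. Exactly-once delivery to downstream regions would additionally need progress tracking, which this sketch leaves out.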

NSDI '26 Open Access Sponsored by
King Abdullah University of Science and Technology (KAUST)

BibTeX
@inproceedings {316090,
author = {Zhicheng Huang and Ramiz Dundar and Yizheng Xie and Konstantinos Kallas and Nikos Vasilakis},
title = {Fractal: {Fault-Tolerant} {Shell-Script} Distribution},
booktitle = {23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI 26)},
year = {2026},
isbn = {978-1-939133-54-0},
address = {Renton, WA},
pages = {2339--2354},
url = {https://www.usenix.org/conference/nsdi26/presentation/huang},
publisher = {USENIX Association},
month = may
}
