ReproZip: Using Provenance to Support Computational Reproducibility
Fernando Chirigati, Polytechnic Institute of NYU; Dennis Shasha, New York University; Juliana Freire, Polytechnic Institute of NYU
We describe ReproZip, a tool that makes it easier for authors to publish reproducible results and for reviewers to validate these results. By tracking operating system calls, ReproZip systematically captures detailed provenance of existing experiments, including data dependencies, libraries used, and configuration parameters. This information is combined into a package that can be installed and run in a different environment. Usability is an important goal for ReproZip. Besides simplifying the creation of reproducible results, the system also helps reviewers. Because the package is self-contained, reviewers need not install any additional software to run the experiments. In addition, ReproZip generates a workflow specification for the experiment. Reviewers can execute this specification within a workflow system to explore the experiment and try different configurations, and the provenance kept by the workflow system can facilitate communication between reviewers and authors.
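To make the idea of deriving dependencies from operating system calls more concrete, the following is a minimal sketch and not ReproZip's actual implementation. It assumes the system calls have already been logged as text in an strace-like format; the function name `collect_dependencies`, the regular expression, and the sample trace are all hypothetical and chosen only for illustration.

```python
import re

# Assumed input format: strace-like lines, for example
#   openat(AT_FDCWD, "/usr/lib/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
#   execve("/usr/bin/python", ["python", "run.py"], ...) = 0
CALL_RE = re.compile(
    r'^(?P<call>open|openat|execve)\('
    r'(?:AT_FDCWD,\s*)?'
    r'"(?P<path>[^"]*)"'
    r'.*\)\s*=\s*(?P<ret>-?\d+)'
)


def collect_dependencies(trace_lines):
    """Return the sorted paths that were successfully opened or executed."""
    deps = set()
    for line in trace_lines:
        m = CALL_RE.match(line.strip())
        if m is None:
            continue  # not a call this sketch cares about
        if int(m.group("ret")) < 0:
            continue  # failed call, e.g. a miss during a library search
        deps.add(m.group("path"))
    return sorted(deps)


# Hypothetical trace of a small experiment (illustrative data only).
SAMPLE_TRACE = [
    'execve("/usr/bin/python", ["python", "run.py"], 0x7ffd /* 20 vars */) = 0',
    'openat(AT_FDCWD, "/usr/lib/libm.so.6", O_RDONLY|O_CLOEXEC) = 3',
    'open("/opt/missing.so", O_RDONLY) = -1 ENOENT (No such file or directory)',
    'openat(AT_FDCWD, "data/input.csv", O_RDONLY) = 4',
    'read(4, "a,b,c\\n", 4096) = 6',
]

if __name__ == "__main__":
    for path in collect_dependencies(SAMPLE_TRACE):
        print(path)
```

In this sketch, the executable, the shared library, and the input file are reported as dependencies, while the failed lookup and the `read` call are ignored. A packaging step could then copy the listed files into an archive, which is the general shape of the self-contained package described above.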