Retrospective Provenance Without a Runtime Provenance Recorder

Authors: 

Timothy McPhillips, University of Illinois at Urbana-Champaign; Shawn Bowers, Gonzaga University; Khalid Belhajjame, Paris Dauphine University; Bertram Ludäscher, University of Illinois at Urbana-Champaign

Abstract: 

The YesWorkflow (YW) toolkit aims to provide users of scripting languages such as Python, Perl, and R with many of the benefits of scientific workflow automation. YW requires neither the use of a workflow engine nor the overhead of adapting or instrumenting code to run in such a system. Instead, YW enables scientists to annotate their scripts with special comments that reveal the main computational blocks and dataflow dependencies otherwise implicit in scripts. YW tools extract and analyze these comments, represent scripts in terms of entities based on a typical scientific workflow model, and provide graphical workflow views (i.e., prospective provenance) of scripts. In this paper, we present a new extension of YW for inferring retrospective provenance from script executions without relying on a runtime provenance recorder. Instead we exploit the common practice of scientists to embed important pieces of provenance in directory structures and file names. For such “provenance-friendly” data organizations, we offer a new annotation mechanism based on URI templates. YW uses these to link conceptual-level prospective provenance with data files created at runtime, resulting in a powerful, integrated model of prospective and retrospective provenance. We present scientifically meaningful retrospective provenance queries for investigating an execution of a data acquisition workflow implemented as a Python script, and show how these queries can be evaluated using the YW toolkit.
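
The idea of recovering provenance from file names can be illustrated with a minimal Python sketch. This is not the YW implementation or its API: the function names, the `{name}` placeholder syntax, the template string, and the file paths are all hypothetical and chosen only for illustration. The sketch assumes each placeholder stands for part of a single path segment, converts the template to a regular expression, and reports the variable bindings for every file that matches.

```python
import re


def template_to_regex(template):
    """Compile a template such as 'run/{sample}/img_{frame}.raw' into a regex.

    Hypothetical helper: each {name} becomes a named group that matches
    one or more characters other than '/'.
    """
    # re.split with a capturing group alternates literal text (even
    # indices) and placeholder names (odd indices).
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)
        else:
            pattern += f"(?P<{part}>[^/]+)"
    return re.compile("^" + pattern + "$")


def match_files(template, paths):
    """Return (path, bindings) for every path that fits the template."""
    regex = template_to_regex(template)
    results = []
    for path in paths:
        m = regex.match(path)
        if m:
            results.append((path, m.groupdict()))
    return results


if __name__ == "__main__":
    # Hypothetical output files left behind by a script run.
    paths = [
        "run/DRT240/img_007.raw",
        "run/DRT322/img_011.raw",
        "run/notes.txt",
    ]
    for path, bindings in match_files("run/{sample}/img_{frame}.raw", paths):
        print(path, bindings)
```

Each binding recovered this way (here, which sample and frame a file belongs to) is a piece of retrospective provenance obtained after the run, purely from how the data were organized on disk.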

BibTeX
@inproceedings {192094,
author = {Timothy McPhillips and Shawn Bowers and Khalid Belhajjame and Bertram Lud{\"a}scher},
title = {Retrospective Provenance Without a Runtime Provenance Recorder},
booktitle = {7th USENIX Workshop on the Theory and Practice of Provenance (TaPP 15)},
year = {2015},
address = {Edinburgh, Scotland},
url = {https://www.usenix.org/conference/tapp15/workshop-program/presentation/mcphillips},
publisher = {USENIX Association},
month = jul
}