Datalog as a Lingua Franca for Provenance Querying and Reasoning

Saumen Dey and Sven Köhler, UC Davis; Shawn Bowers, Gonzaga University; Bertram Ludäscher, UC Davis

Provenance, i.e., the lineage and processing history of data, has become increasingly important within scientific workflow systems. Provenance information can be used, e.g., to explain, debug, and reproduce the results of computational experiments as well as to determine the validity and quality of data products. Standard models for representing provenance information (such as OPM) largely focus on providing a minimal, common set of observables and constraints (in terms of causal and temporal relationships). For scientific workflow applications, however, the workflow itself and the corresponding (implicit) constraints on provenance relationships are often essential for interpreting and querying provenance information. In this paper, we propose Datalog as a “lingua franca” for representing, querying, and specifying integrity constraints over provenance information, and introduce a unifying provenance model for specifying workflows, traces, and temporal constraints. We also demonstrate advantages of using Datalog together with the unified model through a number of examples.

BibTeX

@inproceedings {179552,
author = {Saumen Dey and Sven K{\"o}hler and Shawn Bowers and Bertram Lud{\"a}scher},
title = {Datalog as a Lingua Franca for Provenance Querying and Reasoning},
booktitle = {4th USENIX Workshop on the Theory and Practice of Provenance (TaPP 12)},
year = {2012},
address = {Boston, MA},
url = {https://www.usenix.org/conference/tapp12/workshop-program/presentation/dey},
publisher = {USENIX Association},
month = jun
}
