Policy-Based Integration of Provenance Metadata


A. Gehani, D. Tariq, B. Baig and T. Malik, “Policy-Based Integration of Provenance Metadata,” 2011 IEEE International Symposium on Policies for Distributed Systems and Networks, Pisa, Italy, 2011, pp. 149-152, doi: 10.1109/POLICY.2011.12.


Reproducibility has been a cornerstone of the scientific method for hundreds of years. The range of sources from which data now originates, the diversity of the individual manipulations performed, and the complexity of the orchestrations of these operations all limit the reproducibility that a scientist can ensure solely by manually recording their actions. We use an architecture in which aggregation, fusion, and composition policies define how provenance records can be automatically merged to facilitate the analysis and reproducibility of experiments. We show that the overhead of collecting and storing provenance metadata can vary dramatically depending on the policy used to integrate it.

Keywords: Proteins, Bioinformatics, Genomics, Databases, Materials, Kernel

