Provenance Tools help with the collection and display of provenance information for data products

Several provenance collection and display tools have resulted from provenance research at ITSC in conjunction with the AMSR-E SIPS. The Earth science Library for Processing History (ELPH) is a provenance collection software library developed in Perl for integration into the SIPS processing framework. ELPH follows a provenance capture model based on the Open Provenance Model to log “consumed”, “invoked”, and “produced” events and assign unique URNs to all artifacts (i.e., data files or software processes). ELPH logs conform to an internal XML provenance log schema, suitable for ingest into the AMSR-E provenance repository. While ELPH is tailored to the AMSR-E SIPS, it can be used in other contexts to instrument script-driven processing, or as a reference implementation for provenance logging.
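
The capture model described above can be illustrated with a minimal sketch. ELPH itself is written in Perl and its XML log schema is internal, so everything here is an assumption made for illustration: the `ProvenanceLog` class, the `provenanceLog` and `event` element names, the attribute names, and the `urn:example:` namespace are all hypothetical. The sketch only shows the general idea of assigning one URN per artifact and recording “invoked”, “consumed”, and “produced” events against those URNs.

```python
import uuid
import xml.etree.ElementTree as ET

# The three event types named in the text.
EVENT_TYPES = ("invoked", "consumed", "produced")


def make_urn(kind, name):
    """Build a unique URN for an artifact.

    The 'urn:example:' namespace and the kind:name:uuid layout are
    hypothetical; they are not ELPH's actual URN scheme.
    """
    return f"urn:example:{kind}:{name}:{uuid.uuid4()}"


class ProvenanceLog:
    """Hypothetical in-memory provenance log that serializes to XML."""

    def __init__(self):
        self.root = ET.Element("provenanceLog")  # assumed element name
        self._urns = {}  # (kind, name) -> URN, so each artifact keeps one URN

    def urn_for(self, kind, name):
        """Return the URN for an artifact, creating it on first use."""
        key = (kind, name)
        if key not in self._urns:
            self._urns[key] = make_urn(kind, name)
        return self._urns[key]

    def log(self, event, process, artifact=None):
        """Record one event for a process and, optionally, a data file."""
        if event not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event}")
        elem = ET.SubElement(
            self.root, "event",
            type=event,
            process=self.urn_for("process", process),
        )
        if artifact is not None:
            elem.set("artifact", self.urn_for("file", artifact))
        return elem

    def to_xml(self):
        """Serialize the whole log as an XML string."""
        return ET.tostring(self.root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical script and file names.
    log = ProvenanceLog()
    log.log("invoked", "regrid.pl")
    log.log("consumed", "regrid.pl", "swath_input.hdf")
    log.log("produced", "regrid.pl", "gridded_output.hdf")
    print(log.to_xml())
```

Caching URNs by artifact is the design choice that matters in this sketch: the same process or file must carry the same identifier in every event so that separate log entries can later be joined into a processing graph.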

An online Context Metadata Entry Form allows data managers to provide a concise summary of context information, including algorithm versions and descriptions, geophysical parameters and data fields encoded in science products, ancillary files used in processing, flag values with explanations, and pointers to full documentation from a variety of existing information sources (e.g., Algorithm Theoretical Basis Documents, software release notes, user guide documents). This high-level information pertains to the series of data files that comprise a given data product, and provides context for the provenance details available for each individual file. The information is stored in the AMSR-E provenance repository for display in the Provenance Browser. A translation utility allows users to export the lineage subset of this collection-level information as an ISO-compliant XML record.

The AMSR-E Provenance Browser is a customized solution for the AMSR-E SIPS built on the Drupal content management system. This Drupal profile consists of a provenance repository and the Provenance Browser user interface. This tool provides a number of ways to explore the provenance repository, including browsing a list of files or a series of images, searching for a specific file, or requesting general provenance and context information for a data collection. Once the user has selected a specific data file or collection, the browser aggregates the basic processing graph, metadata, and context information. Users can trace the full lineage of a data product by viewing the provenance information for each input file.
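
Tracing a full lineage by following the inputs of each file amounts to walking a graph upstream. The sketch below is an assumption-laden illustration, not the Provenance Browser's implementation: the `trace_lineage` function, the `inputs_of` mapping, and the file names are all hypothetical. It assumes the repository can report, for any file, the list of files consumed to produce it.

```python
from collections import deque


def trace_lineage(inputs_of, target):
    """Return every upstream file of `target`, nearest ancestors first.

    `inputs_of` is a hypothetical mapping from a file name to the list
    of files consumed to produce it. Files with no entry are treated as
    having no recorded inputs.
    """
    seen = {target}       # guards against revisiting shared inputs
    lineage = []
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in inputs_of.get(current, []):
            if parent not in seen:
                seen.add(parent)
                lineage.append(parent)
                queue.append(parent)
    return lineage


if __name__ == "__main__":
    # Hypothetical processing chain: two swath files feed one daily grid.
    inputs_of = {
        "daily_grid": ["swath_1", "swath_2"],
        "swath_1": ["brightness_temp_1"],
        "swath_2": ["brightness_temp_2"],
    }
    for name in trace_lineage(inputs_of, "daily_grid"):
        print(name)
```

A breadth-first walk lists direct inputs before more distant ancestors, which mirrors how a user would step through the browser one input file at a time.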

Software library automatically records provenance information during data product generation
Metadata entry form captures additional context information at the data collection level
Provenance Browser allows users to explore full provenance and context information in a variety of ways
