Provenance
Methodology and software for IT validation and verification of the provenance and reliability of data arising from complex processes.
An architecture and a methodology for recording provenance information and for building provenance-capable applications were developed in the EU Grid Provenance project. In this context, the “provenance information” of data is information about the process that produced the data. Such information is important, or even indispensable, for a number of applications, for example to document reliably the individual steps performed in engineering calculations, or to keep trustworthy records of where data is located and how it has changed in medical data processing and management. The demonstration applications in the Provenance project therefore included a complex manoeuvre simulation from the aeronautics field, an organ transplant management application, and a patient data management application from the field of medicine.
The architecture developed in the Provenance project is a service-oriented architecture. Provenance information recorded by an application is saved in a “provenance store”, from which it can later be retrieved and evaluated. A programming library (“client side library”) was developed in Java so that this information can be recorded easily from within an application. The following illustration shows the provenance architecture:

In the Provenance project, DLR extended a workflow management system for complex engineering simulations to include provenance recording and evaluated it using various workflows (parameter variation and simulation of complex flight manoeuvres). The resulting knowledge is applied in the AeroGrid project.
In addition to the existing Java library, DLR is currently developing a client-side provenance library in Python, which will allow provenance recording to be integrated into a number of applications, in particular the DataFinder.
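The recording pattern described above (an application reports each process step through a client-side library, and the records are kept in a provenance store for later evaluation) can be illustrated with a small Python sketch. This is not the API of the Java library or of the Python library under development at DLR. The names `ProvenanceRecord`, `InMemoryProvenanceStore`, `ProvenanceClient`, the record fields, and the file names are all hypothetical, and the store is an in-memory stand-in for what would be a remote service in a service-oriented architecture.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class ProvenanceRecord:
    """One documented process step (hypothetical schema, for illustration only)."""
    process_id: str
    actor: str
    action: str
    inputs: list
    outputs: list
    record_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class InMemoryProvenanceStore:
    """Stand-in for a provenance store; a real store would be a remote service."""

    def __init__(self):
        self._records = []

    def record(self, rec):
        # Persist the record and return its identifier.
        self._records.append(rec)
        return rec.record_id

    def query(self, process_id):
        # Return all records of one process, in recording order.
        return [r for r in self._records if r.process_id == process_id]

    def lineage(self, process_id, item):
        # Walk backwards from a data item to the actions that produced it.
        # Assumes the recorded steps form an acyclic graph.
        steps = []
        for rec in self.query(process_id):
            if item in rec.outputs:
                for src in rec.inputs:
                    steps.extend(self.lineage(process_id, src))
                steps.append(rec.action)
        return steps


class ProvenanceClient:
    """Hypothetical client-side helper that an application calls per step."""

    def __init__(self, store, process_id, actor):
        self.store = store
        self.process_id = process_id
        self.actor = actor

    def record_step(self, action, inputs, outputs):
        rec = ProvenanceRecord(self.process_id, self.actor, action, inputs, outputs)
        return self.store.record(rec)


# Example: a two-step simulation workflow records its own provenance.
store = InMemoryProvenanceStore()
client = ProvenanceClient(store, process_id="manoeuvre-sim-001", actor="workflow-engine")
client.record_step("generate_mesh", inputs=["geometry.dat"], outputs=["mesh.dat"])
client.record_step("run_solver", inputs=["mesh.dat"], outputs=["result.dat"])

# Later evaluation: dump the records and reconstruct how a result came about.
for rec in store.query("manoeuvre-sim-001"):
    print(json.dumps(asdict(rec)))
print(store.lineage("manoeuvre-sim-001", "result.dat"))
```

Keeping the store behind a narrow `record`/`query` interface is what would let the in-memory class be swapped for a client of a remote store without changing the application code that calls `record_step`.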