DataFinder
The DataFinder is a lightweight application for managing technical and scientific data. It was developed to handle large amounts of data and can store data through a number of different storage interfaces (e.g. WebDAV, FTP, GridFTP, SRB, OpenAFS, or TSM). The structure of the data and the descriptive meta data are stored in XML format on a central server and can be edited using the standardised WebDAV protocol.
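As an illustration of editing meta data over WebDAV, the following minimal sketch builds the XML body of a standard WebDAV PROPPATCH request, which sets properties on a resource. It only constructs the request body and does not contact a server. The namespace `http://example.org/meta/`, the function name `build_proppatch_body`, and the property names are hypothetical and do not reflect the DataFinder's actual data model.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
# Hypothetical namespace chosen for this example only.
META_NS = "http://example.org/meta/"


def build_proppatch_body(properties):
    """Return the XML body (bytes) of a PROPPATCH request setting the given properties."""
    root = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_el = ET.SubElement(root, f"{{{DAV_NS}}}set")
    prop_el = ET.SubElement(set_el, f"{{{DAV_NS}}}prop")
    for name, value in properties.items():
        # Each meta data entry becomes one child element of <prop>.
        item = ET.SubElement(prop_el, f"{{{META_NS}}}{name}")
        item.text = str(value)
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    body = build_proppatch_body({"author": "A. Example", "mach_number": 0.8})
    print(body.decode("utf-8"))
```

A client would send this body with the HTTP method `PROPPATCH` to the URL of the resource whose meta data should change.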
The DataFinder’s main functions are:
- Uploading and downloading data
- Applying standardised and user-defined meta information to data
- A search function in which several search terms can be combined
- Script processing to automate workflows (e.g. automatic uploading and downloading, or calculations).
Managing the data structures at a central location and describing the data with standardised meta information makes it significantly easier to find data and therefore to avoid duplicate work. Users can also reuse partial results or input data from other calculations already available on the servers without having to generate the same data again. The flexible meta data concept further allows users to attach additional meta information to the data on the servers.
The DataFinder user interface is a platform-independent client that allows users to navigate through the existing data, search for data, create and manage meta information for all data, and execute scripts stored locally or on the server. The client was developed in Python with the Qt GUI library and therefore offers a native look and feel on each supported platform.
On the server side, data management solutions using the DataFinder are based on open standards such as the WebDAV protocol and are therefore easy to extend. The currently supported WebDAV servers are the Tamino XML server by Software AG, a commercial solution, and the Catacomb WebDAV server, the product of an open source project. These servers allow server-side searching and automatic versioning of documents and data.
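To give an idea of what a server-side search with combined terms can look like, the sketch below builds the body of a WebDAV SEARCH request in the `basicsearch` grammar of RFC 5323, joining several equality conditions with `and`. It assumes a server that implements this grammar; how the servers named above expose searching is not described here. The namespace `http://example.org/meta/`, the function name `build_search_body`, and the property names are hypothetical.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
# Hypothetical namespace chosen for this example only.
META_NS = "http://example.org/meta/"


def _dav(tag):
    """Qualify a tag name with the DAV: namespace."""
    return f"{{{DAV_NS}}}{tag}"


def build_search_body(scope_href, equals):
    """Return a basicsearch body (bytes) matching resources where all properties equal the given values."""
    if not equals:
        raise ValueError("at least one search condition is required")
    root = ET.Element(_dav("searchrequest"))
    basic = ET.SubElement(root, _dav("basicsearch"))
    # Ask the server to return all properties of matching resources.
    select = ET.SubElement(basic, _dav("select"))
    ET.SubElement(select, _dav("allprop"))
    # Search the given collection and everything below it.
    from_el = ET.SubElement(basic, _dav("from"))
    scope = ET.SubElement(from_el, _dav("scope"))
    ET.SubElement(scope, _dav("href")).text = scope_href
    ET.SubElement(scope, _dav("depth")).text = "infinity"
    # Combine the individual search terms with a logical AND.
    where = ET.SubElement(basic, _dav("where"))
    and_el = ET.SubElement(where, _dav("and"))
    for name, value in equals.items():
        eq = ET.SubElement(and_el, _dav("eq"))
        prop = ET.SubElement(eq, _dav("prop"))
        ET.SubElement(prop, f"{{{META_NS}}}{name}")
        ET.SubElement(eq, _dav("literal")).text = str(value)
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    body = build_search_body("/projects/", {"author": "A. Example", "status": "final"})
    print(body.decode("utf-8"))
```

Because the conditions are evaluated on the server, only the matching resources need to be returned to the client.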
The DataFinder is developed and enhanced in the D-Grid Integration Project (Data Management Area of Specialisation) and provided as an easy-to-use tool for scientific data management in grids.