Data catalog project—A browsable, searchable, metadata system
- Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Plasma Science and Fusion Center
- European Atomic Energy Community (Euratom), Padova (Italy)
Modern experiments are typically conducted by large, extended groups, where researchers rely on other team members to produce much of the data they use. The experiments record very large numbers of measurements that can be difficult for users to find, access and understand. Here, we are developing a system for users to annotate their data products with structured metadata, providing data consumers with a discoverable, browsable data index. Machine-understandable metadata captures the underlying semantics of the recorded data, which can then be consumed both by programs and interactively by users. Collaborators can use these metadata to select and understand recorded measurements. The data catalog project is a data dictionary and index that enables users to record general descriptive metadata, use cases and rendering information, and provides them with a transparent data access mechanism (URI). Users describe their diagnostics, including references, text descriptions, units, labels, example data instances, author contact information and data access URIs. The list of possible attribute labels is extensible, but limiting the vocabulary of names increases the utility of the system. The data catalog is focused on the data products and complements process-based systems like the Metadata Ontology Provenance project [Greenwald, 2012; Schissel, 2015]. This system can be coupled with MDSplus to provide a simple platform for data-driven display and analysis programs. Sites that use MDSplus can describe tree branches and, if desired, create ‘processed data trees’ with homogeneous node structures for measurements. Sites not currently using MDSplus can either use the database to reference local data stores, or construct an MDSplus tree whose leaves reference the local data store.
A data catalog system can provide a useful roadmap of data acquired from experiments or simulations, making it easier for researchers to find and access important data and to understand what the data mean and how they were obtained. This is particularly useful in research facilities that study the results of many different experiments or simulations and whose researchers may not know the intricacies of the data organization used at the site where the data were generated. It is also possible to store a copy of key data items in local MDSplus trees and then add processed data to the local catalog.
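As a rough illustration of the kind of entry the abstract describes, the following Python sketch models a data product annotated with attribute labels drawn from a limited but extensible vocabulary, plus a simple text search over the index. It is not the project's implementation: the class `CatalogEntry`, the function `search`, the `CORE_LABELS` set, and the example names, address and URI are all hypothetical, chosen only to mirror the attribute kinds listed above.

```python
from dataclasses import dataclass, field

# Assumed core vocabulary, modeled on the attribute kinds named in the abstract.
CORE_LABELS = frozenset(
    {"description", "units", "label", "reference", "example", "contact", "uri"}
)


@dataclass
class CatalogEntry:
    """One annotated data product (hypothetical structure)."""

    name: str
    attributes: dict = field(default_factory=dict)

    def annotate(self, label, value, allow_new=False):
        """Attach one attribute; labels outside the core set need allow_new=True."""
        if label not in CORE_LABELS and not allow_new:
            raise KeyError(f"unknown attribute label: {label!r}")
        self.attributes[label] = value


def search(entries, text):
    """Return entries whose name or any attribute value contains text (case-insensitive)."""
    needle = text.lower()
    return [
        e for e in entries
        if needle in e.name.lower()
        or any(needle in str(v).lower() for v in e.attributes.values())
    ]


# Example data; the names, address and URI below are made up for illustration.
density = CatalogEntry("electron_density")
density.annotate("description", "Line-averaged electron density")
density.annotate("units", "m^-3")
density.annotate("contact", "someone@example.org")
density.annotate("uri", "example://site/tree/electron_density")

current = CatalogEntry("plasma_current")
current.annotate("units", "A")

catalog = [density, current]
print([e.name for e in search(catalog, "density")])
```

Rejecting unknown labels by default while allowing an explicit override is one way to reflect the trade-off noted in the abstract: the label list stays extensible, but a small shared vocabulary keeps entries comparable and searchable.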
- Research Organization:
- Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Plasma Science and Fusion Center
- Sponsoring Organization:
- USDOE Office of Science (SC), Fusion Energy Sciences (FES)
- Grant/Contract Number:
- SC0012470
- OSTI ID:
- 1897960
- Alternate ID(s):
- OSTI ID: 1399141
- Journal Information:
- Fusion Engineering and Design, Vol. 112; ISSN 0920-3796
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
The Java interface of MDSplus: towards a unified approach for local and remote data access | journal | August 2000
The MDSplus data acquisition system, current status and future directions | journal | January 1999
A metadata catalog for organization and systemization of fusion simulation data | journal | December 2012
Data analysis software tools for enhanced collaboration at the DIII–D National Fusion Facility | journal | August 2000