
Title: Data catalog project—A browsable, searchable, metadata system

Journal Article · Fusion Engineering and Design
 [1];  [1];  [1];  [2]
  1. Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Plasma Science and Fusion Center
  2. European Atomic Energy Community (Euratom), Padova (Italy)

Modern experiments are typically conducted by large, extended groups, where researchers rely on other team members to produce much of the data they use. The experiments record very large numbers of measurements that can be difficult for users to find, access and understand. Here, we are developing a system for users to annotate their data products with structured metadata, providing data consumers with a discoverable, browsable data index. Machine-understandable metadata captures the underlying semantics of the recorded data, which can then be consumed both by programs and interactively by users. Collaborators can use these metadata to select and understand recorded measurements. The data catalog project is a data dictionary and index that enables users to record general descriptive metadata, use cases and rendering information, and provides them with a transparent data access mechanism (URI). Users describe their diagnostics, including references, text descriptions, units, labels, example data instances, author contact information and data access URIs. The list of possible attribute labels is extensible, but limiting the vocabulary of names increases the utility of the system. The data catalog is focused on the data products and complements process-based systems like the Metadata Ontology Provenance project [Greenwald, 2012; Schissel, 2015]. This system can be coupled with MDSplus to provide a simple platform for data-driven display and analysis programs. Sites that use MDSplus can describe tree branches and, if desired, create ‘processed data trees’ with homogeneous node structures for measurements. Sites not currently using MDSplus can either use the database to reference local data stores, or construct an MDSplus tree whose leaves reference the local data store.
A data catalog system can provide a useful roadmap of data acquired from experiments or simulations, making it easier for researchers to find and access important data and to understand what the data mean and how they were obtained. This is particularly useful in research facilities that study the results of many different experiments or simulations, where researchers may not know the intricacies of the data organization used at the site that generated the data. It is also possible to store a local copy of key data items in local MDSplus trees and then add processed data to the local catalog.
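
As an illustration of the kind of record the abstract describes, the following is a minimal sketch of a catalog entry with a limited attribute vocabulary, a data access URI and a simple text search. It is not the project's implementation: the class names `CatalogEntry` and `DataCatalog`, the `ALLOWED_ATTRIBUTES` set, the `mdsplus://` URI form and the example entries are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse

# Hypothetical controlled vocabulary of attribute labels. The abstract says the
# list is extensible but that a limited vocabulary makes the catalog more useful.
ALLOWED_ATTRIBUTES = {
    "description", "units", "label", "reference",
    "author_contact", "example_instance", "use_case", "rendering",
}


@dataclass
class CatalogEntry:
    """One annotated data product: a name, an access URI and its metadata."""
    name: str
    uri: str
    attributes: dict = field(default_factory=dict)

    def __post_init__(self):
        # Reject attribute labels that are outside the agreed vocabulary.
        unknown = set(self.attributes) - ALLOWED_ATTRIBUTES
        if unknown:
            raise ValueError(f"labels not in vocabulary: {sorted(unknown)}")
        # A data access URI must at least carry a scheme (file, http, ...).
        if not urlparse(self.uri).scheme:
            raise ValueError(f"data access URI needs a scheme: {self.uri!r}")


class DataCatalog:
    """In-memory index of entries, searchable by name or metadata text."""

    def __init__(self):
        self._entries = {}

    def add(self, entry):
        self._entries[entry.name] = entry

    def search(self, text):
        # Case-insensitive substring match over names and attribute values.
        needle = text.lower()
        return sorted(
            e.name for e in self._entries.values()
            if needle in e.name.lower()
            or any(needle in str(v).lower() for v in e.attributes.values())
        )

    def uri_for(self, name):
        return self._entries[name].uri


# Example entries; the names, URIs and values are invented for illustration.
catalog = DataCatalog()
catalog.add(CatalogEntry(
    "plasma_current",
    "mdsplus://example.org/tree/magnetics/ip",  # hypothetical URI form
    {"description": "Total plasma current", "units": "A", "label": "Ip"},
))
catalog.add(CatalogEntry(
    "line_density",
    "file:///data/example/density.h5",
    {"description": "Line-averaged electron density", "units": "m^-3"},
))

print(catalog.search("current"))
print(catalog.uri_for("line_density"))
```

A data consumer would search the index by keyword and then follow the returned URI to the data itself, whether that is an MDSplus tree node or a local data store. Validating labels at construction time is one way to keep the vocabulary small while still allowing it to be extended by editing a single set.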

Research Organization:
Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States). Plasma Science and Fusion Center
Sponsoring Organization:
USDOE Office of Science (SC), Fusion Energy Sciences (FES)
Grant/Contract Number:
SC0012470
OSTI ID:
1897960
Alternate ID(s):
OSTI ID: 1399141
Journal Information:
Fusion Engineering and Design, Vol. 112; ISSN 0920-3796
Publisher:
Elsevier
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 2 works
Citation information provided by
Web of Science

References (4)

The Java interface of MDSplus: towards a unified approach for local and remote data access. Journal article, August 2000.
The MDSplus data acquisition system, current status and future directions. Journal article, January 1999.
A metadata catalog for organization and systemization of fusion simulation data. Journal article, December 2012.
Data analysis software tools for enhanced collaboration at the DIII–D National Fusion Facility. Journal article, August 2000.
