
Figure 257578: DataCite newsletter6
DataCite is an international organization focused on the issues of data visibility, ease of data citation in scholarly publications, data preservation and future re-use, and data access and retrievability. OSTI became a member of and an allocating agent for DataCite in 2011. OSTI built the OSTI Data ID Service to handle its DataCite responsibilities and helps DOE organizations submit the metadata needed for OSTI to index datasets, assign Digital Object Identifiers (DOIs) to them, and register those datasets with DataCite. OSTI is the only U.S. federal agency with this authority.
OSTI is currently working, in various stages of planning, testing and production, to register data for the Atmospheric Radiation Measurement Program Data Center and Archive at Oak Ridge National Laboratory (ORNL), the Measurement and Instrumentation Data Center at the National Renewable Energy Laboratory, the Coherent X-ray Imaging Data Bank at Lawrence Berkeley National Laboratory, the National Nuclear Data Center at Brookhaven National Laboratory and the DOE Geothermal Data Repository. The many hundreds of datasets at these centers represent and provide pathways to millions of individual data files.
OSTI managers Mark Martin and Jannean Elliott traveled to the National Academy of Sciences in Washington, D.C., for the DataCite summer meeting on September 19-20. The sessions drew nearly 200 participants from countries around the world, representing the government, academic, publishing and scientific communities. OSTI invited its newest data "client," the Oak Ridge Leadership Computing Facility at ORNL, to attend. Sudharshan Vazhkudai and Terry Jones accepted, and Jones gave a talk on the opportunities that data registration and citation will bring to science disciplines that rely on supercomputing as a necessary tool.
In his presentation, Jones noted that high-performance computing yields breakthroughs in science that were not possible with earlier computers. But he asked an interesting question: "What further breakthroughs might science be missing out on because of the limited communication abilities of traditional papers?" Today's scientists, many of whom are supercomputer users, are unnecessarily constrained when traditional papers are their only means of conveying their science. Indeed, it is often difficult even to replicate a present-day experiment in enough detail to support the content of a paper. To replicate today's complex experiments, scientists need more than the paper itself. By making it possible for scientists to easily "publish" their supercomputer files (e.g., data, the source code of their applications, and so forth), DOIs provide an essential mechanism for scientific progress. They also help data centers meet the new emphasis on open availability of government data. Jones also explained the specific benefits of dataset DOIs for the scientific user and for the sponsor funding the research.
Jones noted that the Oak Ridge Leadership Computing Facility sees OSTI, through its data registration partnership with DataCite, as providing the support and service the supercomputing community needs to do more for science through better data management. "OSTI has been an indispensable partner in our efforts to make supercomputer-produced scientific data more accessible," said Jones. "Our ability to use OSTI's services and expertise has been critical to how we are fashioning our DOI infrastructure as an enabling component for citing, sharing and supporting future re-use of data."