OSTI has been making government R&D results open and transparent since 1947
The road has been paved for the Department of Energy (DOE) to quickly implement public access to scholarly publications resulting from DOE funding. Three inter-related building blocks are firmly in place: (1) the DOE Scientific and Technical Information Program (STIP) is as strong as ever; (2) submissions of DOE research and development (R&D) results to the Office of Scientific and Technical Information (OSTI) are at all-time highs; and (3) E-Link has been redesigned and made easier to navigate. All this is to say that a solid foundation has been laid for the upcoming DOE public access portal, the web-based tool that will make scholarly scientific publications resulting from DOE research funding publicly accessible and searchable at no charge to readers.
OSTI spearheads the DOE STIP, a collaboration of scientific and technical information (STI) managers and technical information officers from across DOE working to ensure that the results of DOE-funded R&D are identified, collected, disseminated and preserved. OSTI also devised and manages the corporate E-Link submission system, a tool long used by Departmental organizations and researchers at universities to submit metadata and full text for technical reports, conference papers and other forms of STI. Streamlining E-Link helps ensure that the results of DOE's multibillion-dollar annual investment in research are preserved and available, as appropriate, on SciTech Connect. SciTech Connect provides free and convenient access to full-text and bibliographic DOE R&D results.
The DOE STIP network includes STI liaisons from DOE field, site and procurement offices, national laboratories and research facilities. This network was bolstered in the spring of 2012 when seven DOE program offices agreed to OSTI’s request to designate representatives to the DOE STIP. OSTI sought to broaden Departmental representation in STIP in order to help improve the availability of STI.
The STIP expansion, the committed efforts of all STIP representatives and the renewed recognition that dissemination of STI is an integral part of the scientific process have together yielded impressive results. OSTI took in a significantly higher volume of STI in the fiscal year that ended September 30. STIP submissions were more than 50 percent higher in FY13 than in FY12, totaling more than 31,000 items representing a wide range of basic and applied research from nanotechnology to small modular reactors.
On September 23, OSTI deployed a redesign of E-Link, making it easier to use and navigate, especially for university and other researchers in the grantee community. E-Link also now supports ORCID (see related story), which enables OSTI to improve the way thousands of author names in its collections are parsed and maintained.
Through the use of E-Link, OSTI acquires metadata and electronic documents or links to documents. R&D information has historically been in textual formats – reports, conference papers, and journal articles – but has recently expanded to include vast datasets (from simulation and modeling, for example) and multimedia such as video, audio and visualization. The E-Link tool is also used to acquire and process multimedia-based R&D information and metadata related to numeric datasets. The metadata for text typically includes a link to full-text documents residing in institutional repositories, or the electronic full text is submitted to OSTI, along with the metadata.
Based on metadata fields and any access limitation, OSTI then organizes and processes acquired information into a number of publicly accessible information products for the search and retrieval of metadata including links to free full-text documents.
This is how the DOE public access portal will work: metadata is centralized, and full text is mostly decentralized but nevertheless accessible seamlessly.
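The "centralized metadata, decentralized full text" arrangement can be pictured with a small sketch. It is only an illustration: the record layout, the field names and the example.org address are assumptions, not the actual E-Link schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubmissionRecord:
    """Hypothetical metadata record; the field names are illustrative only."""
    title: str
    authors: List[str]
    full_text_url: Optional[str] = None   # link to a copy held in an institutional repository
    full_text_file: Optional[str] = None  # file submitted together with the metadata

    def full_text_location(self) -> str:
        """Report where a reader would be sent for the full text."""
        if self.full_text_url:
            return "remote: " + self.full_text_url
        if self.full_text_file:
            return "central: " + self.full_text_file
        return "metadata only"


# The central index holds every metadata record, wherever the full text lives.
central_index = [
    SubmissionRecord("Report A", ["Author One"],
                     full_text_url="https://repository.example.org/report-a.pdf"),
    SubmissionRecord("Report B", ["Author Two"], full_text_file="report-b.pdf"),
]

for record in central_index:
    print(record.title, "->", record.full_text_location())
```

A search only needs the central index; the reader is handed off to the remote or central copy when the full text is requested.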
OSTI is preparing this gateway in response to a February 22, 2013, White House Office of Science and Technology Policy directive, “Increasing Access to the Results of Federally Funded Scientific Research.” The OSTP memorandum called on federal science agencies to develop and implement plans to provide public access to the results of research they fund within a year of publication.
ORCID stands for Open Researcher and Contributor ID. The ORCID initiative was launched in 2010 to provide a registry of unique researcher identifiers and a transparent method of linking research activities and outputs to these identifiers. In other words, ORCID addresses the problem of finding the works of a specific author when there are multiple authors with the same name or multiple variations of the author’s name. ORCID is bringing together universities, publishers, funding bodies and other organizations in the scholarly communications space to create a global, open registry of unique, persistent identifiers for individual researchers. Learn more in this OSTIblog article, “Name Ambiguity and ORCID.”
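To see why a unique identifier helps, consider a minimal sketch with invented citations. The record layout is hypothetical and the identifiers are placeholders rather than real ORCID iDs; the point is only that grouping by identifier keeps name variants together and keeps namesakes apart.

```python
from collections import defaultdict

# Hypothetical citations. The identifiers are placeholders, not real ORCID iDs.
citations = [
    {"title": "Paper 1", "author": "J. Smith",   "author_id": "id-0001"},
    {"title": "Paper 2", "author": "Jane Smith", "author_id": "id-0001"},
    {"title": "Paper 3", "author": "J. Smith",   "author_id": "id-0002"},
]


def works_by_identifier(records):
    """Group titles by author identifier instead of by the printed name."""
    works = defaultdict(list)
    for record in records:
        works[record["author_id"]].append(record["title"])
    return dict(works)


print(works_by_identifier(citations))
```

Here "J. Smith" and "Jane Smith" are treated as one author because they share an identifier, while the second "J. Smith" is kept separate.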
OSTI officially joined ORCID in May 2013. The first step in incorporating ORCID numbers into OSTI products has been the retooling of the submission process for DOE-funded STI to include authors’ ORCID numbers. At this point, there are a limited number of authors with ORCID numbers, but their numbers will grow now that the E-Link submission process encourages their use. Here is an example of a SciTech Connect citation in which the author has an ORCID number.
SciTech Connect is OSTI's state-of-the-art search tool for making technical reports and bibliographic records publicly accessible. SciTech Connect contains far more than technical reports. For example, close to 55 percent of SciTech Connect citations are for journal articles.
Total number of SciTech Connect journal article citations: 2,523,132
Earlier this year, the DOE Data Explorer (DDE), a tool for locating DOE data and non-text information, received an upgrade with a new design and expanded search functionalities. The “new” DDE has the “look and feel” of SciTech Connect. Like SciTech Connect, DDE automatically breaks down the results of the search into groupings that allow you to shortcut through a long list of citations and go directly to the subset of your choice. In DDE, the groupings are based on the types of data and non-text items that are retrieved by your search term. Search using the word “solar,” for example, and you will instantly see that the 95 items retrieved break down into 61 collections of numeric data, 17 collections of multimedia … and all the way down to one collection of computer animations or simulations that are related to solar energy or solar technology.
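The grouping idea can be sketched in a few lines. This is not DDE's implementation; the records and type labels below are invented, and the sketch simply counts results per type and then narrows to one group.

```python
from collections import Counter

# Hypothetical search results; the type labels are invented for illustration.
results = [
    {"title": "Item 1", "type": "numeric data"},
    {"title": "Item 2", "type": "numeric data"},
    {"title": "Item 3", "type": "numeric data"},
    {"title": "Item 4", "type": "multimedia"},
    {"title": "Item 5", "type": "multimedia"},
    {"title": "Item 6", "type": "animations/simulations"},
]


def group_counts(items):
    """Count the retrieved items per type, largest group first."""
    return Counter(item["type"] for item in items).most_common()


def in_group(items, group):
    """Shortcut to the subset of results that belongs to one group."""
    return [item for item in items if item["type"] == group]


print(group_counts(results))
print(in_group(results, "multimedia"))
```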
The content of the database has also been expanded. The “old” DDE contained records prepared by OSTI staff; these records identified collections of data and non-text information. The “new” DDE contains these and will continue to add more, but it now also contains records submitted by the owners or holders of data in DOE. These records are sent to OSTI to announce an individual dataset or datastream found within a collection and to register the data for a Digital Object Identifier (DOI). Now, both the dataset records and the collection citations are an integrated part of the DDE’s content.
A DOI is a permanent electronic identifier assigned to an individual document or dataset. It gives the content more stable linking and aids in the citation, discovery and retrieval of R&D results and scholarly publications. A new page on the DDE website explains OSTI’s Data ID Service. This service assigns unique DOIs to datasets and registers them with DataCite for international and permanent access.
DDE’s new search engine still offers easy basic and advanced searching. It now also supports Boolean searches, publication date range searches and searches that can be limited to just datasets (records with DOIs) or just collections (no DOIs). All results can now be sorted by relevance or alphabetically by title.
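The dataset/collection limit and the title sort can be illustrated with a short sketch. It assumes, as described above, that a dataset record carries a DOI and a collection record does not; the records, titles and DOI strings are placeholders, and the function names are hypothetical.

```python
# Hypothetical records; the DOI strings are placeholders, not registered DOIs.
records = [
    {"title": "Wind Tower Measurements", "doi": "10.0000/placeholder.1"},
    {"title": "atmospheric data collection", "doi": None},
    {"title": "Solar Irradiance Dataset", "doi": "10.0000/placeholder.2"},
]


def limit_to(items, kind):
    """Keep only 'datasets' (records with a DOI) or 'collections' (no DOI)."""
    if kind not in ("datasets", "collections"):
        raise ValueError("kind must be 'datasets' or 'collections'")
    want_doi = kind == "datasets"
    return [item for item in items if bool(item.get("doi")) == want_doi]


def sort_by_title(items):
    """Sort alphabetically by title, ignoring letter case."""
    return sorted(items, key=lambda item: item["title"].lower())


for item in sort_by_title(limit_to(records, "datasets")):
    print(item["title"])
```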
One of the most popular features for DDE users has always been the browsing capability. All of the browsing options in the previous version of DDE remain, but enhancements are here, too. For example, browsing by sponsor/funding organization now allows you to quickly separate collections funded totally by DOE from those sponsored by a combination of DOE and other entities.
Customizable features now include the ability to download retrieved records into an Excel spreadsheet format or create an account to store your searches in a personal “library.” Also new is the ability to log on and choose how you wish to view DDE’s search results.
DataCite is an international organization focused on the issues of data visibility, ease of data citation in scholarly publications, data preservation and future re-use, and data access and retrievability. OSTI became a member of and an allocating agent for DataCite in 2011. OSTI built the OSTI Data ID Service to handle its DataCite responsibilities and helps DOE organizations submit the metadata needed for OSTI to index datasets, assign Digital Object Identifiers (DOIs) to them, and register those datasets with DataCite. OSTI is the only U.S. federal agency with this authority.
OSTI is currently working, in various stages of planning, testing and production, to register data for the Atmospheric Radiation Measurement Program Data Center and Archive at Oak Ridge National Laboratory (ORNL), the Measurement and Instrumentation Data Center at the National Renewable Energy Laboratory, the Coherent X-ray Imaging Data Bank at Lawrence Berkeley National Laboratory, the National Nuclear Data Center at Brookhaven National Laboratory and the DOE Geothermal Data Repository. The many hundreds of datasets at these centers represent and provide pathways to millions of individual data files.
OSTI managers Mark Martin and Jannean Elliott traveled to the National Academy of Sciences in Washington D.C. for the DataCite summer meeting on September 19–20. The sessions were attended by nearly 200 participants from countries around the world and from government, academic, publishing and scientific communities. OSTI invited its newest data “client,” the Oak Ridge Leadership Computing Facility at ORNL, to attend. Sudharshan Vazhkudai and Terry Jones accepted, and Jones gave a talk on the opportunities that data registration and citation will bring to those science disciplines using the power of supercomputing as a necessary tool.
In his presentation, Jones noted that high-performance computing yields breakthroughs in science that were not possible with earlier computers. But he asked an interesting question: “What further breakthroughs might science be missing out on because of the limited communication abilities of traditional papers?” Today’s scientists (oftentimes supercomputer users) are unnecessarily constrained when they are limited to bibliographic papers to convey scientific expression. Indeed, it is often difficult to simply replicate a present-day experiment with enough detail to support the content of a paper. For scientists to be able to replicate today’s complex experiments, they need more. By making it possible for scientists to easily “publish” their supercomputer files (e.g., data, the source code of their application, and so forth), DOIs provide an essential mechanism for scientific progress. They also help data centers meet the new emphasis on open availability of government data. Jones also explained specific benefits of DOIs for datasets to the scientific user and to the sponsor funding the research.
Jones noted that the Oak Ridge Leadership Computing Facility sees OSTI and its data registration partnership with DataCite as providing the support and the service needed to enable the supercomputing community to do more for science through better data management. “OSTI has been an indispensable partner in our efforts to make supercomputer-produced scientific data more accessible,” said Jones. “Our ability to use OSTI's services and expertise has been critical to how we are fashioning our DOI infrastructure as an enabling component for citing, sharing and supporting future re-use of data.”
In July, The Manhattan Project: Resources, a web-based collaboration between the DOE Office of Classification and the Office of History and Heritage Resources, was launched. The site is designed to disseminate information and documentation on the Manhattan Project to a broad audience including scholars, students and the general public. OSTI is hosting this information as part of the OpenNet website. Manhattan Project Resources consists of two parts: 1) Manhattan Project: An Interactive History, a multi-page, easy-to-read and easy-to-navigate overview of the Manhattan Project, and 2) the full-text, declassified, multi-volume Manhattan District History commissioned by General Leslie Groves in late 1944. The new site brings together an enormous amount of material, much of it never before released.
OpenNet is an example of OSTI’s capability to provide customized information tools and services for individual DOE offices and non-DOE government entities on a cost-reimbursable basis. These services are provided under the authority of the Economy Act (31 U.S.C. 1535-36).
OSTI offers expertise in a range of technical areas, including:
OSTI develops and maintains subject-specific databases, web portals and websites; manages information systems; and provides electronic publishing and creative services to help DOE program offices, other government agencies and international organizations better manage their information resources.
Lorrie Apple Johnson is OSTI’s program liaison for the DOE Offices of Science, Environmental Management and ARPA-E. In that role, she works extensively with the Department’s research programs to facilitate the acquisition, management and dissemination of scientific and technical information through customized products and services. Lorrie also leads OSTI’s information sciences and bibliometrics services, which support DOE programs’ efforts to analyze and measure their scholarly literature output. She serves as the product manager for both ScienceCinema and WorldWideScience.org, specializing in the implementation and use of innovative information technologies from public-private partnerships, such as audio-indexing, multilingual machine translations, and federated search. Lorrie has a Master of Science degree in Library and Information Sciences from the University of Tennessee, and dual Bachelor of Science degrees in Biochemistry and Zoology from North Carolina State University.
|Source|Results|Total|
|---|---|---|
|DOE Data Explorer|89|95|
|DOE Green Energy|96|96|
|DOE R&D Accomplishments|30|30|
|DOE Technology Transfer|40|40|
|Energy Science and Technology Software Center|6|44|
|Science Conference Proceedings|200|119,384|
|Science Journals Connector|193|653,841|
|Science Open Access Journals|200|63,339|
If you do a lot of searching on one of OSTI’s federated search tools, then you should be aware of the “Source Status” feature. “Source Status” is available on ScienceAccelerator, E-print Network, Science Conference Proceedings, National Library of Energy, Science.gov and WorldWideScience.org. You will find “Source Status” in the upper right-hand corner on the search results page. This gives you a listing of all the databases being searched by the federated search tool and gives a summary of the results. In the first column, a green checkmark means the search of the database was successfully completed. In rare cases, the search of an individual source is not completed. A red “X” means the connection to the database was not made. A clock face with a small red “x” indicates that the search timed out before a result was received. In those cases where a search is not completed, it is usually a temporary issue at one of the decentralized sources outside OSTI. Checking “Source Status” will let you know if you are missing results from key databases. If so, you may want to redo your search.
The next two columns in “Source Status” give the number of citations for your search terms from each database. These columns are titled “Results” and “Total.” The “Results” column is the number of citations from the database that are included in your federated search results. The “Total” column gives the total number of citations your search found in the database. Often, these numbers will be the same. However, in many cases, a search retrieves a large number of citations from a database and only the top-ranked ones are included in your results. If you check “Source Status” and see databases with a significant difference between “Results” and “Total,” you might want to go directly to that database and redo your search so you don’t miss significant citations. The Advanced Search page has a list of all sources searched, and each title is linked to the source, making it easy to search a source that has a high “Total” count.
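The check described above can be sketched with the sample figures from the source listing in this issue, reading the two numbers for each source as “Results” and “Total.” The function name and the cutoff of 100 missing citations are assumptions made for illustration; what counts as a "significant difference" is a judgment call.

```python
# (results, total) per source, taken from the sample listing in this issue.
source_status = {
    "DOE Data Explorer": (89, 95),
    "DOE Green Energy": (96, 96),
    "DOE R&D Accomplishments": (30, 30),
    "DOE Technology Transfer": (40, 40),
    "Energy Science and Technology Software Center": (6, 44),
    "Science Conference Proceedings": (200, 119384),
    "Science Journals Connector": (193, 653841),
    "Science Open Access Journals": (200, 63339),
}


def worth_searching_directly(status, min_missing=100):
    """List sources whose total exceeds the returned results by at least min_missing."""
    return [name for name, (results, total) in status.items()
            if total - results >= min_missing]


for name in worth_searching_directly(source_status):
    print(name)
```

With this cutoff, the three sources whose totals run into the tens or hundreds of thousands are flagged, while sources that returned all or nearly all of their citations are not.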
For 250 years, Bayesian inference has been an important tool for estimating probabilities, given knowledge of certain related probabilities. DOE researchers are incorporating Bayesian inference in research areas such as crystallography, medical diagnostic and astronomical imaging, threat detection, groundwater transport modeling, building energy research, climate modeling and more. A layman’s overview of Bayesian inference and of specific uses and conceptual ramifications of Bayes’ theorem in DOE research endeavors is available in the OSTI Collections – Bayesian Inference by Dr. William Watson, physicist, of OSTI’s staff.
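As a small worked illustration of estimating one probability from related ones, the sketch below applies Bayes' theorem to a made-up detector. The numbers are hypothetical and are not drawn from any DOE study.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    p_evidence = p_evidence_if_true * prior + p_evidence_if_false * (1.0 - prior)
    return p_evidence_if_true * prior / p_evidence


# Hypothetical detector: 1% of items are true threats, the alarm fires for
# 95% of true threats and for 5% of harmless items.
p_threat_given_alarm = posterior(prior=0.01, p_evidence_if_true=0.95,
                                 p_evidence_if_false=0.05)
print(round(p_threat_given_alarm, 3))
```

Under these assumed numbers, an alarm raises the probability of a true threat from 1 percent to about 16 percent, because harmless items far outnumber threats.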