U.S. Department of Energy, Office of Science, Office of Scientific and Technical Information

Speeding Nano Progress Using Information Diffusion


Presented by RL Scott
Walt Warnick, Ph.D.
Director, Office of Scientific and Technical Information
U.S. Department of Energy

Workshop: Informatics Needs for Nanomaterials
Oak Ridge, TN
February 8–9, 2007


Three Topics Relating to Nano Info Diffusion

We see three complementary approaches to improving information sharing and awareness:
Modeling – It's possible.
Metadata – numeric data, unlike textual data, requires metadata to ensure access
Stewardship – numeric data could follow model of textual STI management


OSTI's Mission

To advance science and sustain technological creativity by making R&D findings available and useful to DOE researchers and the American people.

OSTI's creed: Knowledge is contagious – it's our job to make sure everyone “catches” it!


Science Progresses as Knowledge Is Shared

OSTI corollary: If the sharing of knowledge – or knowledge diffusion – is accelerated, scientific progress is accelerated.


Knowledge Diffusion Can Be Measured and Modeled

Researchers will “catch” an idea faster if the “contact rate” between scientists is increased.

From: Power of a Good Idea: Quantitative Modeling of the Spread of Ideas from Epidemiological Models (362-KB PDF)


Models: Knowledge Diffusion

Carbon Nanotubes: This case shows a moderate sensitivity to the contact rate. Doubling the rate speeds up the science by about four years. This is a relatively large community with total authors estimated at tens of thousands.

From: Report for the Office of Scientific and Technical Information: Population Modeling of the Emergence and Development of Scientific Fields (579-KB PDF) by Luis M. A. Bettencourt, et al., October 2006.
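
The effect of the contact rate can be illustrated with a toy calculation. The sketch below is a minimal susceptible-infected (SI) model, far simpler than the population models in the cited reports. The community size, starting number of adopters, and contact rates are illustrative assumptions, not values fitted to carbon nanotube data, and the function name `years_to_fraction` is hypothetical.

```python
def years_to_fraction(contact_rate, population=10_000, initial_adopters=10,
                      target_fraction=0.5, dt=0.01, max_years=200.0):
    """Return the time (in years) for an idea to reach a share of a community.

    Integrates the SI equation dI/dt = beta * I * (N - I) / N with forward
    Euler steps, where I is the number of researchers who have adopted the
    idea, N is the community size, and beta is the contact rate per year.
    All default values are assumptions chosen for illustration only.
    """
    adopters = float(initial_adopters)
    target = target_fraction * population
    elapsed = 0.0
    while adopters < target and elapsed < max_years:
        susceptible = population - adopters
        adopters += contact_rate * adopters * susceptible / population * dt
        elapsed += dt
    return elapsed


# Compare an assumed baseline contact rate with a doubled one.
baseline = years_to_fraction(contact_rate=0.5)
doubled = years_to_fraction(contact_rate=1.0)
print(f"baseline: {baseline:.1f} years; doubled contact rate: {doubled:.1f} years")
```

In this simple model, doubling the contact rate roughly halves the time to reach half the community. The size of the speed-up in a real field depends on the fitted parameters, which is why the reports treat each case separately.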


Metadata Is a Must (when it comes to numeric data)

Numeric databases and other non-text databases must have metadata to enable searchability and retrieval
Numeric databases must have a steward and be consistent with the proven model of text data centers
Holders of numeric data must be encouraged to harmonize practices
Promoting access, preservation and interoperability
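
To make the point concrete, the sketch below shows how a small descriptive record can make a numeric data set findable by a text query. The field names, the `DatasetRecord` class, the `search` helper, the DOIs, and the stewards are all hypothetical and chosen for illustration; they are not drawn from any existing metadata standard or registry.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical minimal metadata for one numeric data collection."""
    doi: str                    # persistent identifier (made-up values below)
    title: str
    steward: str                # organization responsible for the data
    keywords: list[str] = field(default_factory=list)
    units: dict[str, str] = field(default_factory=dict)  # column name -> unit


def search(records, term):
    """Return records whose title or keywords contain the term."""
    term = term.lower()
    return [
        r for r in records
        if term in r.title.lower() or any(term in k.lower() for k in r.keywords)
    ]


records = [
    DatasetRecord(
        doi="10.0000/example-nano-001",
        title="Carbon nanotube tensile strength measurements",
        steward="Example Data Center",
        keywords=["nanotubes", "mechanical properties"],
        units={"strength": "GPa"},
    ),
    DatasetRecord(
        doi="10.0000/example-clim-002",
        title="Surface radiation time series",
        steward="Example Climate Archive",
        keywords=["climate", "radiation"],
        units={"flux": "W/m^2"},
    ),
]

hits = search(records, "nanotubes")
for record in hits:
    print(record.doi, record.title, record.steward)
```

The numbers themselves carry no searchable words. Only the title, keywords, and units in the record let a query for "nanotubes" find the first data set.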


Ensuring Access to Numeric Data
An example: German National Library for Science and Technology (TIB)

A DOI registry is one approach
In cooperation with several World Data Centers, TIB has assigned DOIs for scientific primary data.
Over 400,000 data sets in the field of earth science have been registered; the goal is to establish a worldwide DOI registration agency for primary data.
The TIB assigns DOIs only for "collections," large data sets, and databases. It does not attempt to treat each data file individually.


Management of Scientific Text Is a Model for Numeric Data

The senior STI managers from 12 U.S. federal agencies form an interagency working group called CENDI.

Each agency has an organization to manage STI. (Numeric data would need a specialist administrator, or steward.)
Defense Technical Information Center (Department of Defense)
Office of Research and Development & Office of Environmental Information (Environmental Protection Agency)
Government Printing Office
NASA Scientific and Technical Information Program
National Agricultural Library (Department of Agriculture)
National Archives and Records Administration
National Library of Education (Department of Education)
National Library of Medicine (Department of Health and Human Services)
National Science Foundation
National Technical Information Service (Department of Commerce)
Office of Scientific and Technical Information (Department of Energy)
USGS/Biological Resources Discipline (Department of the Interior)


Textual Research Results Are Available through Interagency Portal

Provides access to 50 million pages of science information in a single query
A parallel approach could be developed for numeric data
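
The single-query idea can be sketched as a fan-out: one query goes to several independent sources and the results are merged. The code below is only an illustration of that pattern under stated assumptions. It does not describe how the interagency portal is built, and the source names, record fields, and the helpers `make_source` and `federated_search` are hypothetical.

```python
def make_source(documents):
    """Build a stand-in search function over an in-memory list of records."""
    def search_fn(query):
        query_lower = query.lower()
        return [d for d in documents if query_lower in d["title"].lower()]
    return search_fn


def federated_search(query, sources):
    """Send one query to every source and merge results, skipping duplicates."""
    merged = []
    seen_ids = set()
    for source_name, search_fn in sources.items():
        for item in search_fn(query):
            if item["id"] in seen_ids:
                continue  # the same record was already returned by another source
            seen_ids.add(item["id"])
            merged.append({**item, "source": source_name})
    return merged


# Two made-up collections; "rep-1" appears in both to show de-duplication.
sources = {
    "reports": make_source([
        {"id": "rep-1", "title": "Nanotubes growth kinetics"},
        {"id": "rep-2", "title": "Reactor safety review"},
    ]),
    "articles": make_source([
        {"id": "art-1", "title": "Nanotubes in composites"},
        {"id": "rep-1", "title": "Nanotubes growth kinetics"},
    ]),
}

results = federated_search("nanotubes", sources)
for item in results:
    print(item["source"], item["id"], item["title"])
```

A numeric-data source could be added to the same dictionary, provided its records carry searchable metadata such as titles or keywords.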


Establishing Ground Rules for Metadata Enables Access to Data
Overcome barriers of organizations
International Portal
Interagency Portal
Agency Portal
Lab or Institute Portal
Nano Portal
data moving forward


Search for Nano Should Reach Text and Numeric Data

A search for "nanotubes" finds many documents, but numeric data is harder to locate.
Information centers, such as OSTI, ensure seamless access to textual data.
Data repositories, using DOIs and metadata, can serve a similar role.
Links between publications and the underlying data will enable researchers to locate essential information.
Textual info, e.g., tech reports or journal articles
Numeric data sets


Data Preservation Dilemma

NSB raises the right questions
Many critical science and official collections must be sustained for the foreseeable future
Critical collections:
Community reference data collections (e.g., Protein Data Bank)
Irreplaceable collections (e.g., ARM data related to climate change)
Experimental research data (e.g., BaBar and other event data)
“…the progress of science and useful arts … depends on the reliable preservation of knowledge and information for generations to come.”
“Preserving Our Digital Heritage”
Library of Congress
No plan for preservation (i.e., no steward) often means that data is lost or damaged.


In Summary: Three Points on Nano Info Diffusion

Modeling – It's possible.
Metadata – numeric data, unlike textual data, requires metadata to ensure access
Stewardship – numeric data could follow model of textual STI management