DOE Data: Would We, Could We...?

by Jannean Elliott on Sun, January 31, 2010

I can’t remember how it went now, but as a child I skipped rope to a rhyme that included “would I, could I” somewhere in it.  Recently, questions were asked about OSTI’s involvement with scientific research data.  Is OSTI planning to become a repository for numeric data?  Are we going to issue Digital Object Identifiers (DOIs) for datasets?  Would we be telling people how to manage their data?  For some reason, the questions triggered the memory of that old refrain, but now I was thinking from an OSTI perspective, “would we, could we…?”

Fortunately, I’m much clearer about OSTI’s answer to those questions than I am about the conclusion of that old rhyme.  In order, the answers are a simple no, maybe, and no.

I’m in a position to know these answers because of my tasks here at OSTI.  I work with the Scientific and Technical Information Program (STIP), which handles policies and processes for information submissions to OSTI.  I’m also the product manager for the DOE Data Explorer and an OSTI point of contact for a related, ongoing Small Business Technology Transfer (STTR) grant.

If you wonder why anyone would ask whether OSTI plans to begin taking in data, the question is no doubt triggered by the revision, currently under way, of the STI directive DOE O 241.1A.  That directive basically says that an announcement notice (citation/bibliographic record) for any scientific and technical information resulting from DOE-funded R&D must be submitted to OSTI.  For technical reports and, when possible, for other document types, that announcement notice contains a URL that links to the PDF document.  OSTI’s databases allow users to search both the citation in the database and the full text of the document, whether it resides at OSTI or elsewhere.

Today’s researchers, however, want more than documents; they want to see for themselves computer simulations, images from the heart of an accelerator, numeric datasets – all the wealth of information forms and formats that today’s technologies have made possible.  These expectations come from a “sea change” in the nature of information, and DOE O 241.1B, the draft revision, will recognize that fact.  It will instruct DOE organizations to submit an announcement notice for the scientific multimedia and data they generate and make available.  The items themselves (especially data!) must be housed at the originating site, the appropriate data center, a DOE user facility, or wherever current arrangements dictate.  Only an announcement notice with a link in it can actually be submitted to OSTI.  The reason for this is simple:  OSTI does not have the capability to store all the multimedia and data currently being generated across the DOE Complex and, with today’s seamless search technologies, really doesn’t need to.

Now, what about DOIs for data?  For several years we have been asking DOE submitters to send us DOIs, if they have them, for technical reports or journal articles.  If they don’t have DOIs already, we obtain them on behalf of the submitter through our membership in CrossRef.  We certainly envision the day when a technical report citation will include DOIs for the specific datasets used by the author.  The reader will click on a DOI and link directly to one of those datasets, whether it resides at a laboratory, at a data center, or on the web page of a grant-holding university.

The DOE-issued STTR grant I mentioned earlier funds a study of issues related to assigning DOIs to data and to correlating specific data to specific reports.  OSTI will be very interested in the results.  We hope, eventually, to help DOE realize the goal of DOIs for data by assisting the different types of data holders in ways that will work for their organizations.

Finally, does OSTI intend to tell people how to manage the data they generate for DOE?  Absolutely not!  Program managers who fund R&D and identify desired outcomes for these projects also have a stake in how the information is managed and preserved.  There are specialists in the labs and user facilities and expert staffs in the DOE Data Centers to manage data.  But we do want to collaborate with them and facilitate the process that will allow non-text information to be represented in our databases through the submitted metadata.

Do you search for scientific information in OSTI’s databases?  Your search habits will not need to change, but, at some point in the future, your search is going to give you a wider, deeper picture of what’s out there in DOE.  You’ll be able to find not only documents, but also the related data and multimedia that answer the questions behind your search.

Jannean Elliott


