INSTRUCTIONS FOR ANNOUNCEMENT OF U.S. DEPARTMENT OF ENERGY (DOE) PUBLICLY
AVAILABLE SCIENTIFIC RESEARCH DATASETS
Purpose:
Announcement Notice (AN) 241.6 provides the U.S. Department of Energy (DOE) Office
of Scientific and Technical Information (OSTI) with the metadata needed to identify and announce
publicly available datasets resulting from work funded by DOE or performed in DOE
facilities. The information allows OSTI to assign Digital Object Identifiers (DOIs) to
datasets and register them with DataCite.
This free, value-added step facilitates visibility, helps ensure long-term preservation, and
supports better linkage between DOE's published research results and the underlying data.
See the page about the DOE Data ID Service for information about the process
by which a DOI is assigned to a dataset and the benefits that result. Then use this page of
instructions to help you fill in the AN 241.6. If you are interested in a more automated
submittal method, see the documentation for the AN 241.6 Web Service/API.
Who uses this Notice:
DOE, DOE Major Site/Facility Management contractors, multi-program and single-program
laboratories, other DOE facilities, and DOE grantees/financial assistance recipients may
complete AN 241.6 and submit it with a URL for the publicly available location of the data.
Contact Information
To contact the DOE Data ID Service, email DOEDataID@osti.gov. For help in completing this Announcement Notice or for assistance with E-Link, call 865-576-4070 or email elink_Helpdesk@osti.gov.
AN 241.6 Metadata Details and Requirements
*An asterisk indicates required information.
Use the Tab key to move between data fields. Pressing the Enter key is treated as an attempt to submit the notice, even if required information has not yet been entered.
To begin:
Select from the Site Code picklist the site or project for which you are creating the metadata record. It is a required field.
Part I: STI Product Description
Dataset Type
Note that "Dataset" is automatically defaulted into the record as the product type whenever you use the AN 241.6. "Dataset Type", however, allows you to be more specific about the dataset. Select one choice from the drop-down list that best describes the dataset's main or most important content.
- Animations/Simulations - Animations and simulations resulting from runs of computer models or similar software.
- Figures/Plots - A dataset consisting mainly of data graphs, charts, diagrams, or schematic drawings.
- Genome/Genetics Data - Information that is numeric or alpha-numeric in nature (such as gene sequences) or that is a specialized mix of text and non-text information conveying results of genetics/genome research.
- Instrument - Use this option to relate datasets which derive from a specific DOE instrument at one of the laboratories or user facilities. This creates a central Instrument record with a DOI to link out to related datasets.
- Interactive Data Map(s) - A non-static interface and the GIS data and/or shape files that generate it.
- Multimedia - An example of a multimedia dataset might be a video of an experiment in progress, where the camera monitors change over a number of hours.
- Numeric Data - Data primarily expressed with numbers; other content is secondary and supporting.
- Specialized Mix - This "type" may be used to indicate a dataset made up of content that doesn't fit into one of the other "type" categories. The content of a "specialized mix" dataset could include some of everything in this list, for example, but it is clearly focused on data and does not have a document "format" the way a data-focused technical report would.
- Still Images or Photos - A collection of images or photographs that are produced by a scientific instrument or that convey scientific results of experiments. Scientific images that might constitute a dataset could be images of cells or molecules that are typically taken with electron microscopes, 3-D structures of proteins or nanomaterials, images captured during an accelerator run, images from astronomy, etc.
Dataset Title
Enter the title given to the data product itself. To aid retrievability and clarity, include part, version, and similar information. Example: Data Package for Year One Monitoring of Dropaway Sinkhole, Creekside, Colorado, 2016, version 2.
Publication/Issue Date
Provide the date when the information product was published or issued, either in format mm/dd/yyyy
(example: 04/17/2011), or in format yyyy (example: 1995). If you use the yyyy format, you may also
select a Time Period from the drop-down list, if known.
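The two accepted date formats can be checked before a record is submitted. The following is a minimal sketch, assuming a hypothetical helper named `classify_publication_date`; it is not part of E-Link, and it only illustrates the mm/dd/yyyy and yyyy rules stated above.

```python
import re
from datetime import datetime


def classify_publication_date(value: str) -> str:
    """Return "yyyy" or "mm/dd/yyyy" for an accepted date; raise ValueError otherwise."""
    value = value.strip()
    if re.fullmatch(r"[0-9]{4}", value):
        return "yyyy"
    if re.fullmatch(r"[0-9]{2}/[0-9]{2}/[0-9]{4}", value):
        # strptime raises ValueError for impossible dates such as 02/30/2011.
        datetime.strptime(value, "%m/%d/%Y")
        return "mm/dd/yyyy"
    raise ValueError(f"not in mm/dd/yyyy or yyyy format: {value!r}")


print(classify_publication_date("04/17/2011"))  # mm/dd/yyyy
print(classify_publication_date("1995"))        # yyyy
```

A value that fails this check, such as "Spring 2011", belongs in the text date field instead.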
Or Text Date
Provide date as Spring 2011, January 2011, etc.
Author(s) And Contributor(s)
In this section of the AN 241.6 you may enter the names of authors/dataset creators, the names of other people who contributed in some way to the data being published, and names of organizations that also contributed. Simply select one of the buttons with a "plus" sign. That button will open a dialog box where you will be able to enter the pertinent information.
The "Manage Authors" pop-up allows you to enter first, middle, and last names in their natural order. The email field is for administrative purposes only and will never display in public databases. You can also enter the author's ORCID, choose or type his/her affiliation, and click the radio button if the name you've just entered is the primary author.
The URL portion is added automatically by the system to make the ORCID an active link in the output databases.
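For submitters who collect ORCIDs in their own systems, a quick check before entry can catch typing errors. The sketch below rests on assumptions not stated in this notice: the 16-character hyphenated ORCID layout and its check-digit rule come from ORCID's public documentation, the `https://orcid.org/` prefix is assumed to be the URL portion the system adds, and the function names are hypothetical.

```python
import re

ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(first_15_digits: str) -> str:
    """Compute the final ORCID character (ISO 7064 MOD 11-2) from the first 15 digits."""
    total = 0
    for ch in first_15_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def orcid_to_url(orcid: str) -> str:
    """Validate a bare ORCID and return it as a link (assumed URL prefix)."""
    orcid = orcid.strip().upper()
    if not ORCID_PATTERN.fullmatch(orcid):
        raise ValueError(f"not in 0000-0000-0000-0000 form: {orcid!r}")
    digits = orcid.replace("-", "")
    if orcid_check_digit(digits[:15]) != digits[15]:
        raise ValueError(f"check digit does not match: {orcid!r}")
    return "https://orcid.org/" + orcid


# 0000-0002-1825-0097 is the sample ORCID used in ORCID's own documentation.
print(orcid_to_url("0000-0002-1825-0097"))  # https://orcid.org/0000-0002-1825-0097
```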
When finished, be sure to click the "Save and Close" button. If you have multiple authors, select "Add Author" again and continue doing so until all your authors' names display correctly. When each author name displays in the interface, three icons also become available for that name: a "move" icon (which allows you to grab a name and move it to the bottom of the list of names so you can delete it), an "edit" icon, and a "delete" icon.
Individual contributors and/or contributing organizations are added one at a time in the same manner. One major difference, however, is that the nature of their contribution can be characterized by selecting one of the controlled vocabulary terms in the "Contributor Type" picklist.
Please note that names of Collaborations need to be entered into the separate field called Contributor Organizations/Collaborations.
The DOI Infix
The DOI Infix field allows you to provide a character string in your submitted record to be added to the DOI that OSTI assigns. The value put in this field can be letters, numbers, or a combination. The infix value might be a project name, a category type, a geographic region, or anything else that adds "intelligence" meaningful to users.
E-Link inserts the infix value provided between the unique prefix for the client and the unique OSTI ID that forms the suffix.
Format: DOI prefix/DOI infix/DOI suffix
Origin: Prefix from DataCite/infix from submitter/suffix from OSTI ID
Example: 10.19597/myprojectname/1105143
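The prefix/infix/suffix layout above can be sketched in a few lines. This is an illustration only: the function name `build_doi` is hypothetical, E-Link performs the actual assembly, and restricting the infix to ASCII letters and digits is an assumption based on the "letters, numbers, or a combination" wording.

```python
import re


def build_doi(prefix: str, infix: str, osti_id: int) -> str:
    """Join the DataCite prefix, the submitter's infix, and the OSTI ID suffix."""
    # Assumption: only ASCII letters and digits are accepted in the infix.
    if not re.fullmatch(r"[A-Za-z0-9]+", infix):
        raise ValueError(f"infix must be letters and/or numbers: {infix!r}")
    return f"{prefix}/{infix}/{osti_id}"


print(build_doi("10.19597", "myprojectname", 1105143))  # 10.19597/myprojectname/1105143
```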
STI Product Identifiers
Dataset Product Number(s)
An identifying number that has been assigned to the dataset by either the originating/submitting organization
or by the organization currently hosting the data. If two different organizations have assigned different numbers
to the dataset, both are listed here. They should be separated with a semicolon and a space. If no identifying
product number exists, the word "None" may be entered in this field.
DOE Contract/Award Number(s)
Enter the DOE contract number under which the work was funded. If the dataset is a result of a joint effort between
two or more DOE Site/Facility Management Contractors, etc., additional DOE contract numbers may be entered. The "DE"
should not be included as a part of the number. Multiple numbers are separated with a semicolon and a space. When
more than one number is entered, the first number is considered the primary number.
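The contract number rules above (no leading "DE", semicolon-and-space separator, primary number first) can be applied programmatically when preparing many records. The sketch below is a hedged example: `format_contract_numbers` is a hypothetical helper, the sample numbers are illustrative only, and stripping an optional hyphen after "DE" is an assumption about how the numbers are usually written.

```python
import re
from typing import List


def format_contract_numbers(numbers: List[str]) -> str:
    """Strip a leading "DE" and join the numbers with "; ", keeping input order.

    The caller is expected to list the primary contract number first.
    """
    cleaned = []
    for number in numbers:
        # Assumption: the prefix appears as "DE-" or "DE" at the start of the number.
        number = re.sub(r"^DE-?", "", number.strip(), flags=re.IGNORECASE)
        if number:
            cleaned.append(number)
    return "; ".join(cleaned)


# Illustrative values, not taken from this notice.
print(format_contract_numbers(["DE-AC05-00OR22725", "SC0012345"]))
# AC05-00OR22725; SC0012345
```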
Other Identifying Number(s)
Examples of other identifying numbers that submitters may want to include are:
- The number assigned by arXiv.org to an author's posting of a paper (arXiv: 1501.00003)
- The accession number from the submitting site's database (45029 or any format)
- Any identifying number that may have meaning or retrieval utility to a particular segment of the anticipated user population (any format)
Note that this field is not for related identifiers/DOIs, which must be accompanied by a controlled vocabulary term that "explains" the relationship
between the DOI you enter and the dataset you are submitting. Other identifying numbers are truly "other", in that they do not typically fit the more specific
identifier fields available for input, e.g. product/report number, contract number, R&D project IDs, etc. Note also that multiple identifiers
may be input here. Separate each identifier from the next with a semicolon and a space.
Award DOI
Enter the Award Digital Object Identifier (DOI) under which work or time was provided. An Award DOI is assigned to awards,
contracts, equipment, facilities, grants, prizes, salary awards, and/or training grants. If the research object was produced
under multiple awards given by multiple organizations, additional Award DOIs may be entered. Multiple Award DOIs are to be
entered using the + to create a new entry. DOIs follow the format: 10.XXXX/XXXXX.
Note: Award DOIs are different from DOIs assigned to the research object currently being announced (submitted) to OSTI. DOIs
for the research object may have been assigned by publishers or a repository. If a DOI for a research object has not already
been assigned from another source, OSTI may assign a DOI.
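A loose shape check for the 10.XXXX/XXXXX format can catch common entry mistakes, such as pasting a full resolver URL instead of the bare DOI. The sketch below is an assumption-laden example: the pattern (a registrant code of 4 to 9 digits followed by any non-space suffix) is a widely used heuristic rather than a rule stated in this notice, and `looks_like_doi` is a hypothetical name.

```python
import re

# Heuristic only: "10.", a 4-9 digit registrant code, "/", then a non-space suffix.
DOI_PATTERN = re.compile(r"10\.[0-9]{4,9}/\S+")


def looks_like_doi(value: str) -> bool:
    """Return True if the value has the general 10.XXXX/XXXXX shape."""
    return DOI_PATTERN.fullmatch(value.strip()) is not None


print(looks_like_doi("10.1234/abcde"))                  # True
print(looks_like_doi("https://doi.org/10.1234/abcde"))  # False
```

Passing this check does not mean the DOI exists or resolves; it only confirms the general shape.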
Geolocation Data
The geolocation fields allow you to enter geographic place names and mapping coordinates for your dataset. Select the "Add Geolocation" button to open the "Add/Edit Geolocation" dialog box. The "Place Name" field is optional and accepts free text. You may wish to name the country, city, and state, for example, where your raw data were collected. You may Save if that's all you wish to do.
To add specific mapping coordinates to supplement the Place Name, use the picklist called "Geolocation Type." You may enter a latitude and longitude pair (Geolocation Point), a bounding box, or a polygon. Geolocation Point opens the latitude and longitude fields; both must be used to create a correct coordinate pair.
If you choose Geolocation Box, the four fields you must enter will open. The Geolocation Polygon selection will open as many fields as you need to define your specific shape. Each point of a polygon is defined by a latitude-longitude pair. The last point (pair that you enter) should be the same as the first. Note that four "points" or paired values are required; that count includes the last one being the same as the first one. Keep adding another set of points/pairs until you have all that the shape of your individual polygon needs.
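The polygon rules above (at least four points, with the last point repeating the first) can be sketched as follows. The helper names are hypothetical, and the triangle coordinates are illustrative values only.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude)


def close_polygon(points: List[Point]) -> List[Point]:
    """Return a copy of the points with the first point repeated at the end if needed."""
    ring = list(points)
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])
    return ring


def validate_polygon(points: List[Point]) -> None:
    """Raise ValueError unless there are at least four points and the ring is closed."""
    if len(points) < 4:
        raise ValueError("a polygon needs at least four points, counting the repeated first point")
    if points[0] != points[-1]:
        raise ValueError("the last point must be the same as the first")


# Three distinct corners become four entries once the ring is closed.
triangle = [(44.9667, -64.2), (44.9667, -63.8), (44.7167, -64.0)]
ring = close_polygon(triangle)
validate_polygon(ring)
print(len(ring))  # 4
```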
Please use the standard WGS 84 (World Geodetic System) recommended by DataCite to express coordinates. WGS 84 specifies "Use only decimal numbers for coordinates. Longitudes are -180 to 180 (0 is Greenwich, negative numbers are west, positive numbers are east), Latitudes are -90 to 90 (0 is the equator; negative numbers are south, positive numbers north)." Here is an example of the values that you might enter to create a bounding box:
North Bound Latitude: 44.9667 | East Bound Longitude: -63.8
South Bound Latitude: 44.7167 | West Bound Longitude: -64.2
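The coordinate ranges and the bounding box example can be checked with a short routine. This is a minimal sketch with hypothetical function names. It deliberately does not require the west longitude to be less than the east longitude, since whether boxes that cross the 180-degree meridian are accepted is not stated here.

```python
from typing import Dict


def check_latitude(value: float) -> float:
    """Return the value if it lies in the range -90 to 90; raise ValueError otherwise."""
    if not -90.0 <= value <= 90.0:
        raise ValueError(f"latitude out of range: {value}")
    return value


def check_longitude(value: float) -> float:
    """Return the value if it lies in the range -180 to 180; raise ValueError otherwise."""
    if not -180.0 <= value <= 180.0:
        raise ValueError(f"longitude out of range: {value}")
    return value


def validate_bounding_box(north: float, south: float, east: float, west: float) -> Dict[str, float]:
    """Check all four bounds and confirm that north is not below south."""
    box = {
        "north": check_latitude(north),
        "south": check_latitude(south),
        "east": check_longitude(east),
        "west": check_longitude(west),
    }
    if box["south"] > box["north"]:
        raise ValueError("south bound latitude is greater than north bound latitude")
    return box


# Values from the bounding box example in the text.
box = validate_bounding_box(north=44.9667, south=44.7167, east=-63.8, west=-64.2)
print(box["north"], box["west"])  # 44.9667 -64.2
```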
To edit or delete any entry that you've put into the Geolocation fields, simply click on the horizontal line on the main page of the AN 241.6 where that entry displays. It will open the dialog box that will allow you to change or to delete that one entry on that one line.
Originating Research Organization
Select the name of the organization that performed the research or issued the dataset from the drop-down list.
More than one organization may be selected. You may also type in the name of the
Originating Research Organization, if you do not see it in the picklist.
Select or list the primary organization first and separate multiple entries with a semicolon and a space.
Collaboration Name(s)
The official name of a Research Collaboration, if applicable, should be entered in this field, not in the author or the Contributing Org field.
This is a free text field and can hold multiple collaboration names. Separate multiple collaboration names with a semicolon followed by a space.
The name of the collaboration should always be submitted with its acronym. The spelled-out name may or may not be familiar to searchers;
submitter discretion may be used in deciding whether to also include the spelled-out name.
- Example with names spelled out: Compact Muon Solenoid (CMS) Collaboration; Heartland Alliance for Regional Transmission (HART)
- Example with acronym only: Belle Collaboration; MicroBooNE Collaboration; MINERvA Collaboration
Availability
Provide the name of any office or organization that can offer additional help in obtaining or utilizing this dataset.
Country of Publication
This field defaults to "United States." If the country of publication is not the United States, select it from the drop-down list.
Language
This field defaults to "English." If the information product is written in another language, select that language from the drop-down list.
Subject Categories
Select one or more categories from the drop-down list. Select the primary one first. A list
of subject categories and their descriptions is available at
https://www.osti.gov/elink/authorities.jsp.
If no subject category is provided by the originating organization, the Office of Scientific and
Technical Information may generate the appropriate categories.
Description/Abstract
Provide a clear, concise summary of the content of the dataset, as well as specialized parameters
that describe the data. Specialized parameters may include a date range during which information
was taken (such as May 01, 2002 - December 31, 2002), information such as well depth ranges,
temperature ranges, etc. The abstract length should be no more than 5,000 characters.
Keywords
Provide terms that describe the content of the dataset. More than one term may be entered;
separate multiple terms with a semicolon and a space. If keywords are not provided by the
originating organization, the Office of Scientific and Technical Information may generate them.
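The semicolon-and-space convention for keywords and the 5,000-character abstract limit are easy to enforce before submission. The sketch below uses hypothetical helper names; treating the limit as a plain character count that includes spaces is an assumption.

```python
MAX_ABSTRACT_CHARS = 5000  # limit stated for the Description/Abstract field


def normalize_keywords(raw: str) -> str:
    """Trim each term, drop empties and repeats, and rejoin with "; "."""
    terms = []
    for term in raw.split(";"):
        term = term.strip()
        if term and term not in terms:
            terms.append(term)
    return "; ".join(terms)


def abstract_within_limit(text: str) -> bool:
    """Assumption: every character, including spaces, counts toward the limit."""
    return len(text) <= MAX_ABSTRACT_CHARS


print(normalize_keywords("geothermal;well logs ;  geothermal; "))  # geothermal; well logs
print(abstract_within_limit("x" * 5001))                           # False
```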
Part II: Technical Specifications
Location Information
URL where dataset is posted for access
Provide the URL that leads to an HTML "landing page" (information page) that provides context and usage information
for the dataset. The landing page must include a direct link to the dataset and/or to its component files.
Provide a complete unique URL (Uniform Resource Locator) address sufficient to
access the landing page.
Digital Object Identifier (if already assigned)
Provide the DOI only if an organization other than OSTI has assigned it.
If the dataset does not already have a DOI, one will be assigned to it by the
DOE Data ID Service.
[Please be aware that registering a dataset for a DOI includes a commitment on the part of the
author or submitter that the dataset will be maintained indefinitely for public access.
DataCite recommends that datasets be placed in the care of a data center or online repository
prior to registration.]
Technical Specifications
Dataset File Extension
Please provide the file extension of the dataset. The content of the dataset will not be
indexed by OSTI, but knowing the type of file posted will be important to the users who
search our databases. Some common file extensions are .txt, .csv, and .ps.
Software needed to utilize dataset (if applicable)
Specialized software tools are often developed to allow a user to manipulate data in various ways.
If these tools are available for the user but do not have to be used with the data, they do not
need to be listed. However, if there is a piece of software without which a user cannot open, see,
or use the dataset, that software should be noted in this field.
Dataset Size
Indicate how many individual data files are included in the dataset being announced, or if the dataset
consists primarily of images, note the approximate number of images.
You may also indicate size in megabytes, and you may indicate whether the dataset is complete or will continue to have files added to it.