Required and Optional Metadata for Harvesting (OSTI-to-Site)

Required Metadata

Records with missing required metadata will not be accepted.

Columns: Metadata; Standardized XML Tag Names for Output File; Business Rules/Required Values, etc.
Access Limitation (system defaults to UNL) Information must be unclassified unlimited for harvested products.
Site Accession Number <accession_no> Unique site-assigned number that OSTI uses to recognize previously harvested records that are being updated.
STI Product Type <product_type> Product type value must be one of these codes:
AR – S&T Accomplishment Report
B – Book/monograph/factsheet
CO – Conference/Event Paper, Presentation, Proceedings
DA – Dataset
JA – Journal Article
PA – Patent
PD – Program Documents
TD – Thesis/Dissertation
TR – Technical Report
STI Product Title <title> Required for all STI products
Author/Creator/PI <author> Format is last name, first name, middle initial.  Separate multiple authors with a semicolon and a space.  Note that the presenter/speaker is the “author” in a CO video. If an author’s affiliation is submitted also, it should always be in parentheses and is best handled in a separate field.  See optional metadata list.
Report/Product Number(s) <report_nos> Value may be “None” if necessary.  Separate multiple values with a semicolon and a space.
DOE Contract Number(s) <doe_contract_nos> Separate multiple values with a semicolon and a space.
Originating Research Organization <research_org> Required.  Values/codes in OSTI’s Originating Research Organization Authority are used.  Sites should use customized tags when submitting metadata from a DOE User Facility, and OSTI will concatenate that information into the Originating Research Organization field in the output databases.
Sponsoring Organization (DOE Program Office and Sub-Program Office) <sponsor_org> Values/codes in OSTI’s Sponsoring Organization Authority are used.
Country of Publication <country_publication_code> Values/codes in OSTI’s Country of Publication Authority are used.  OSTI can default this value into your record if the value will always be US.
Language <language> Values/codes in OSTI’s Language Authority are used.  OSTI can default this value into your record if the value will always be English.
Publication Date <publication_date> See Publication Date* note below for allowed formats.

Releasing Official’s Name
Releasing Official’s Contact Information <released_by_phone> <released_by_email>
Released Date <released_date> This date may be included in any format.  It does not flow to the output databases and is not searchable.
Conference Information <conference_information> Name of conference or lecture series or colloquia, etc., then location (city/state or country).  End with dates if a conference or a workshop with specific beginning and ending dates.
Journal Information <journal_name>



All three of these tags need a value in order for OSTI to obtain a DOI from CrossRef.
Publisher Information <publisher_information> Applicable when product type is B or when product type CO is for published proceedings.
Site URL <site_url> Required when product type is AR, TD, TR, or DA.  It is also used when CO presentations or videos are posted on a non-OSTI website.
Medium <medium> Required when a CO product is audiovisual material.  The value is AV.
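
The rules in the table above can be checked before a record is offered for harvesting. The following is a minimal sketch, not OSTI's own validation. It assumes each record is an XML element (the root name `record` is hypothetical) whose children use the tag names from the table. The list of always-required tags is also an assumption drawn from the table; country and language are left out because the table says OSTI can default them.

```python
import xml.etree.ElementTree as ET

# Product type codes listed in the Required Metadata table.
PRODUCT_TYPES = {"AR", "B", "CO", "DA", "JA", "PA", "PD", "TD", "TR"}

# Assumption: tags treated here as required for every record.
ALWAYS_REQUIRED = [
    "accession_no", "product_type", "title", "author", "report_nos",
    "doe_contract_nos", "research_org", "sponsor_org", "publication_date",
]

def text_of(record, tag):
    """Return the stripped text of a child tag, or '' if absent or empty."""
    return (record.findtext(tag) or "").strip()

def check_record(record):
    """Return a list of problems found in one record element."""
    problems = [f"missing <{tag}>" for tag in ALWAYS_REQUIRED
                if not text_of(record, tag)]
    ptype = text_of(record, "product_type")
    if ptype and ptype not in PRODUCT_TYPES:
        problems.append(f"unknown product_type {ptype!r}")
    # Site URL is required for AR, TD, TR, and DA products.
    if ptype in {"AR", "TD", "TR", "DA"} and not text_of(record, "site_url"):
        problems.append("missing <site_url>")
    return problems
```

An empty list means the record passed these particular checks; it does not guarantee that OSTI will accept the record.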
* Publication Date values/formats allowed are:
  • mm/dd/yyyy  (This format is required for TR and TD products)
  • yyyy
  • yyyy followed by either month, season, or quarter text as follows:

(a) yyyy and month fully spelled out (2000 April)
(b) yyyy and season fully spelled out (2000 Spring)
(c) yyyy and Quarter (2000 1st Quarter (CY) or (FY))
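
The allowed Publication Date formats can be expressed as a small classifier. This is a sketch under stated assumptions: the season names other than Spring, the ordinal spellings (1st to 4th), and the optional trailing "(CY)" or "(FY)" are guesses based on the single examples given above, and the function names are hypothetical.

```python
import re
from datetime import datetime

MONTHS = {"January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"}
# Assumption: only "Spring" appears in the source; the others are guesses.
SEASONS = {"Spring", "Summer", "Fall", "Winter"}
# Assumption: quarter text looks like "1st Quarter", optionally "(CY)" or "(FY)".
QUARTER_RE = re.compile(r"(1st|2nd|3rd|4th) Quarter( \((CY|FY)\))?")

def classify_publication_date(value):
    """Return the kind of date format matched, or None if none matches."""
    value = value.strip()
    if re.fullmatch(r"\d{2}/\d{2}/\d{4}", value):
        try:
            datetime.strptime(value, "%m/%d/%Y")  # rejects dates like 13/40/2000
            return "full"
        except ValueError:
            return None
    if re.fullmatch(r"\d{4}", value):
        return "year"
    match = re.fullmatch(r"(\d{4}) (.+)", value)
    if match:
        rest = match.group(2)
        if rest in MONTHS:
            return "year-month"
        if rest in SEASONS:
            return "year-season"
        if QUARTER_RE.fullmatch(rest):
            return "year-quarter"
    return None

def date_ok_for(product_type, value):
    """TR and TD products require mm/dd/yyyy; other types accept any allowed format."""
    kind = classify_publication_date(value)
    if product_type in {"TR", "TD"}:
        return kind == "full"
    return kind is not None
```

For example, `date_ok_for("TR", "2000")` is false because technical reports need the full mm/dd/yyyy form, while `date_ok_for("JA", "2000")` is true.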

Optional Metadata

Columns: Metadata; XML Tags for Output File; Business Rules/Values, etc.
Authors’ email addresses <author_emails> Separate multiple addresses with a semicolon and a space.
Authors’ Affiliations   Sites may choose tags for a separate field.  Always place parentheses around the affiliations to set them apart from names; the parenthesized affiliations will be placed at the end of the author field in the output databases.  Affiliations should be listed in the same order as the authors to enable correct correlation.
Availability <availability> This is an organization or location to which requests can be referred (if applicable).  For example, a URL with additional help or contact information about a document could be placed in this field.
Description <abstract>  
Digital Object Identifier <doi> Use if a DOI has already been assigned by the site prior to submission to OSTI.  OSTI submits journal article metadata to CrossRef to obtain a valid publisher’s DOI and inserts it in JA records.  OSTI also assigns DOIs to technical reports (TR) and to datasets (DA).
Keywords <keywords> Separate multiple values with a semicolon and a space.
Journal Serial Number <journal_serial_id>  
Document’s Related Information <related_doc_info> Applicable for any product type except DA
Other Identifying IDs <other_identifying_nos> Separate multiple values with a semicolon and a space.
Product Size <product_size> Free text values may indicate number of files in a dataset, number of pages in a report, megabyte size, etc. 
Subject Categories <subject_codes> Values/codes in OSTI’s Subject Category Authority are used.  Separate multiple values with a semicolon and a space.  List the most relevant first.
Name of location where dataset resides <dataset_loc_name> Free text field for values such as National Nuclear Data Center (NNDC)
Audiovisual File Format <av_format> Applicable when product type is CO and medium code is AV. 
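
Several fields in both tables take multiple values separated by a semicolon and a space. The sketch below shows one way a site might assemble such fields into an output record. The root element name `record`, the site-chosen tag `author_affiliations`, the semicolon separator inside the affiliation parentheses, and all sample values are hypothetical and for illustration only.

```python
import xml.etree.ElementTree as ET

def join_values(values):
    """Join multiple values with a semicolon and a space, skipping blanks."""
    return "; ".join(v.strip() for v in values if v.strip())

def build_record(fields):
    """Build one record element from a dict of tag name -> string or list of strings."""
    record = ET.Element("record")  # hypothetical root element name
    for tag, value in fields.items():
        child = ET.SubElement(record, tag)
        if isinstance(value, (list, tuple)):
            child.text = join_values(value)
        else:
            child.text = value
    return record

record = build_record({
    "accession_no": "SITE-0001",                     # sample value
    "product_type": "TR",
    "author": ["Doe, Jane A.", "Roe, Richard"],      # last name, first name, middle initial
    "author_emails": ["jane@example.org", "richard@example.org"],
    # Hypothetical site-chosen tag; affiliations in author order, in parentheses.
    "author_affiliations": "(" + join_values(["Example Lab", "Example University"]) + ")",
    "keywords": ["metadata", "harvesting"],
    "subject_codes": ["99"],                         # sample value; list most relevant first
})
print(ET.tostring(record, encoding="unicode"))
```

Keeping the join in one helper means every multi-valued tag uses the same separator.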


Last updated: November 13, 2013