Harvesting

Harvesting is one of the fully automated mechanisms for submitting AN 241.1 metadata.  OSTI began offering this option in 2003 but is now working with the sites that currently use it to transition them to the AN 241.1 Web Service.  When all of the sites have switched over, Harvesting will be discontinued.

Each Harvesting site maintains a bibliographic database that supports its STI document review/approval/release process and posts full-text documents on its own web servers.  All STI reported via Harvesting is unclassified with unlimited access and should be fully accessible on the Web to the public and to public search engines such as Google.

OSTI sends each site a weekly, automated query containing a date range.  In effect, the query asks, “What records do you have in your database that are either new or that have been updated since the last run?”  The site’s programmed “script,” which resides at a URL on the site’s external server, queries the review/approval database behind the site’s firewall and generates the reply as an XML output file “on the fly.”  OSTI receives this file, parses it according to a mapping customized for the site, and deposits the records into E-Link.  All of this typically happens at night or in the very early morning hours.
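
As a rough illustration of the exchange described above, the following Python sketch shows a site-side function that answers a date-range query with an XML reply, and a receiving-side function that parses that reply.  Everything in it is an assumption made for illustration: the record fields, the element names (records, record, title, url, updated), the example URLs, and the in-memory list standing in for the review/approval database.  It does not reproduce the actual AN 241.1 schema or any site's real script.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the site's review/approval database.
RECORDS = [
    {"id": "R-001", "title": "Report A", "url": "https://example.org/a.pdf",
     "updated": date(2020, 1, 3)},
    {"id": "R-002", "title": "Report B", "url": "https://example.org/b.pdf",
     "updated": date(2020, 1, 10)},
    {"id": "R-003", "title": "Report C", "url": "https://example.org/c.pdf",
     "updated": date(2020, 1, 20)},
]


def build_harvest_reply(records, start, end):
    """Site side: return XML text for records new or updated in [start, end]."""
    root = ET.Element("records")  # element names are illustrative only
    for rec in records:
        if start <= rec["updated"] <= end:
            node = ET.SubElement(root, "record", id=rec["id"])
            ET.SubElement(node, "title").text = rec["title"]
            ET.SubElement(node, "url").text = rec["url"]
            ET.SubElement(node, "updated").text = rec["updated"].isoformat()
    return ET.tostring(root, encoding="unicode")


def parse_harvest_reply(xml_text):
    """Receiving side: map each <record> element to a plain dictionary."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": node.get("id"),
            "title": node.findtext("title"),
            "url": node.findtext("url"),
        }
        for node in root.findall("record")
    ]


# One simulated weekly run covering 8 to 15 January 2020.
reply = build_harvest_reply(RECORDS, date(2020, 1, 8), date(2020, 1, 15))
harvested = parse_harvest_reply(reply)
```

Generating the reply at request time, rather than exporting a file on a schedule, means each run reflects whatever the database holds for the requested date range at that moment.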

After each “run,” OSTI automatically sends the site a confirmation email that reports which records were harvested correctly and which were not.  For any record that did not load successfully, an error message explains the reason.
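
The content of such a confirmation could be assembled along the following lines.  This is only a sketch under stated assumptions: the function name, the (record id, error) result format, the message layout, and the sample error text are all hypothetical and do not reflect the wording of OSTI's actual emails.

```python
def summarize_run(results):
    """Build confirmation text from (record_id, error_or_None) pairs.

    The input format and message layout are hypothetical.
    """
    loaded = [rid for rid, err in results if err is None]
    failed = [(rid, err) for rid, err in results if err is not None]

    lines = [
        f"Records harvested successfully: {len(loaded)}",
        f"Records not loaded: {len(failed)}",
    ]
    for rid in loaded:
        lines.append(f"  OK     {rid}")
    for rid, err in failed:
        # Each failed record carries the reason it did not load.
        lines.append(f"  ERROR  {rid}: {err}")
    return "\n".join(lines)


# Sample run: one record loads, one is rejected (made-up error text).
message = summarize_run([
    ("R-002", None),
    ("R-004", "required field 'title' is missing"),
])
```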

Reference:

  1. Required and Optional Metadata for Harvesting
  2. Editing/Updating Harvested Records