OSTI.GOV · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Evaluation of Data Catalog Software for Hanford Site Environmental Datasets

Technical Report
DOI: https://doi.org/10.2172/1832173 · OSTI ID: 1832173
 [1];  [2]
  1. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
  2. Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

Environmental information and data underpin achievement of the U.S. Department of Energy (DOE) Office of Environmental Management (EM) mission at the Hanford Site. The Hanford Environmental Data Management (HEDM) Program is the DOE Richland Operations Office (RL) approach to develop and implement a formal program for managing environmental data and the associated records, materials, and systems at the Hanford Site. The current project, contract, organization, and contractor-specific efforts at managing environmental data sets are insufficient to provide orderly, long-term, site-wide access. A vital element to be created within the HEDM program plan is a catalog of data sources, called the Hanford Environmental Information and Data Index (HEIDI), that will enable long-term access and retrievability for the multiple independent sources of data that might otherwise be difficult to discover. This report compares leading open source and commercial data catalog platforms using criteria to assess the functionality needed to develop the HEIDI catalog of Hanford data sources that connects and exchanges data with established Hanford Local Area Network (HLAN) enterprise information technology systems. Proprietary platforms evaluated included ArcGIS Enterprise Sites, Junar, OpenDataSoft, and Socrata, and non-proprietary platforms included Energy Data eXchange (EDX), Comprehensive Knowledge Archive Network (CKAN), and DKAN (a Drupal-based open data portal based on CKAN). Capabilities supporting data discoverability, retrieval, and archival, as well as metadata standard requirements and integration into the HLAN were rated as either failing to meet requirements (F), meeting requirements (M), or exceeding requirements by delivering additional desired features (E). The lowest rating for any capability area was assigned as the overall rating for the platform. 
These findings enable DOE-RL and the contractors implementing the HEDM plan to focus on candidate tools likely to meet the requirements for implementing HEIDI. All of the platforms that received an overall rating of ‘F’ could not be deployed on Hanford infrastructure or within dedicated cloud resources. A proprietary software-as-a-service (SaaS) model of delivering a data catalog (e.g., as found in Junar and OpenDataSoft) favors consistency across customers at the expense of the customization and configurable roles needed for Hanford work. Hosting data on a shared commercial platform places limits on dataset size (maximum of 240 Mb for OpenDataSoft), a significant limitation for HEIDI implementation. EDX, a government data catalog based on CKAN, received an ‘F’ rating because authentication from HLAN could not be incorporated into the system. Among platforms rated ‘M’ or ‘E’, only the Socrata platform had a SaaS delivery model. In contrast to other SaaS platforms, Socrata provided custom roles and gateways that allow local datasets to be incorporated into an online catalog. Socrata also complies with the Federal Risk and Authorization Management Program, a significant benefit for cloud-based management of Hanford data. The other platforms rated ‘M’ or ‘E’ (ArcGIS Enterprise Sites, CKAN, and DKAN) provide fully self-hosted options, allowing for greater control and flexibility with the HEIDI catalog. These widely used tools have supportive communities of practice, extensive customization options, and demonstrated deployments that provide evidence that they can meet requirements, often deliver additional desired features, and work well with federal government systems. Completely customized alternatives built on a collection of applications were not evaluated because achieving performance similar to CKAN or DKAN requires substantial resources, especially in the absence of the active communities that have grown to support these tools.
ArcGIS Enterprise Sites, Socrata, CKAN, and DKAN were evaluated as strong candidates for successful implementation with HEIDI.
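
The abstract's scoring rule (each capability area rated F, M, or E, with the lowest rating becoming the platform's overall rating) can be illustrated with a minimal sketch. The capability-area names and the example ratings below are hypothetical placeholders chosen for illustration; they are not values taken from the report, and the function name `overall_rating` is likewise an assumption.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Ordered rating scale as described in the abstract: F < M < E."""
    F = 0  # fails to meet requirements
    M = 1  # meets requirements
    E = 2  # exceeds requirements (delivers additional desired features)


def overall_rating(capability_ratings):
    """Return the lowest rating across all capability areas.

    capability_ratings: dict mapping a capability-area name to a Rating.
    """
    if not capability_ratings:
        raise ValueError("at least one capability rating is required")
    return min(capability_ratings.values())


# Hypothetical ratings for an imaginary platform (NOT from the report).
example_platform = {
    "discoverability": Rating.E,
    "retrieval": Rating.M,
    "archival": Rating.M,
    "metadata_standards": Rating.E,
    "hlan_integration": Rating.F,
}

print(overall_rating(example_platform).name)
```

Under this rule a single failing capability area determines the overall result, which is consistent with the abstract's account of platforms rated ‘F’ because of one deployment or authentication shortfall.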

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1832173
Report Number(s):
PNNL-31960; DVZ-RPT-066
Country of Publication:
United States
Language:
English

Similar Records

Requirements for Cataloging Hanford Geophysical Datasets
Technical Report · Thu Sep 01 2022 · OSTI ID:1832173

FY 1997 Hanford telecommunication and informations system user profile, milestone IRM-097-003
Technical Report · Mon Sep 22 1997 · OSTI ID:1832173

FY 2001 Hanford Waste Management Strategic Plan
Technical Report · Thu Feb 01 2001 · OSTI ID:1832173