U.S. Department of Energy
Office of Scientific and Technical Information

NEPATEC v2.0: Standardized Metadata and Text Corpus of National Environmental Policy Act Documents

Technical Report · DOI:https://doi.org/10.2172/2584716 · OSTI ID:2584716

The National Environmental Policy Act of 1969, as amended (NEPA), is a major environmental law in the United States, requiring Federal agencies to consider and document potential environmental impacts before deciding on a proposed action. Modernization of NEPA and permitting processes faces significant challenges due to the lack of standardized formats and interoperable systems for organizing and sharing NEPA-related information across agencies. Much of the information gathered during NEPA reviews is written into documents such as categorical exclusions, environmental assessments, and environmental impact statements, then filed in predominantly independent agency file stores that may or may not be publicly accessible. The application of metadata and data standards, such as those recommended by the Council on Environmental Quality (CEQ), to NEPA documents offers a shared vocabulary and structure for key entities like projects, processes, and documents that can streamline information exchange and enhance collaboration across systems. In this work, we publicly release NEPATEC2.0, an expanded corpus of NEPA documents with associated metadata. NEPATEC2.0 encompasses approximately 120,000 documents from 60,000 projects prepared by more than 60 different agencies. Modeled to align with CEQ metadata standards, NEPATEC2.0 promotes consistency in environmental reviews and supports the ongoing effort to modernize permitting technologies by facilitating more transparent, efficient, and data-driven decision-making. Importantly, NEPATEC2.0 demonstrates the possibilities and limitations of large language model-based prompting to extract information from NEPA documents at scale.

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
2584716
Report Number(s):
PNNL--38163
Country of Publication:
United States
Language:
English

Similar Records

NEPATEC2.0: NEPA Text Corpus v2.0
Dataset · Tue Sep 30 00:00:00 EDT 2025 · OSTI ID:2997034

Environmental permitting: expediting the NEPA process
Journal Article · Natur. Res. Environ.; (United States) · OSTI ID:6519819

Sandia National Laboratories Ecosystem for Open Science: Metadata Schema v0.2 Description.
Technical Report · Thu Sep 17 00:00:00 EDT 2020 · OSTI ID:1777073