U.S. Department of Energy
Office of Scientific and Technical Information

Data, model inputs, and analysis scripts associated with a manuscript on stream intermittency controls across spatial scales in Pacific Northwest watersheds

Dataset · DOI: https://doi.org/10.15485/3023133 · OSTI ID: 3023133
NOTE: The manuscript associated with this data package is currently in review. The data may be revised based on reviewer feedback. Upon manuscript acceptance, this data package will be updated with the final dataset and additional metadata.

This data package is associated with the manuscript "Hydroclimatic Memory and Watershed Template Shape Stream Intermittency: Multi-scale Attribution Using Process-based Simulation and Explainable ML" by Niroula et al. (2026), submitted to Water Resources Research (WRR). The study investigates the dominant controls on stream intermittency across local, reach, and watershed scales using a coupled process-based simulation and explainable machine-learning framework.

Long-term daily simulations from the Advanced Terrestrial Simulator (ATS) were used to generate wetness states and ponded-depth responses over river-corridor cells. These ATS outputs were then aggregated across scales and used to train XGBoost (eXtreme Gradient Boosting) models. SHAP (SHapley Additive exPlanations) was applied to quantify the relative importance of hydroclimatic forcings, watershed template attributes, and antecedent-memory effects in shaping intermittency behavior.

The analysis is carried out for three contrasting Pacific Northwest watersheds: Oak Creek (OCW), American River Watershed (ARW), and H.J. Andrews (HJA). Across these testbeds, the package contains ATS-ready watershed inputs, ATS run configuration and selected output files, model-evaluation data products, intermittency-analysis datasets, machine-learning target-feature tables, SHAP outputs, and notebooks used to organize, analyze, and visualize results. At a high level, the package documents a workflow in which ATS provides the physically based simulation backbone and explainable machine learning is used as a post-processing attribution tool.
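To illustrate what an additive Shapley attribution means in a workflow like this, the following is a minimal sketch that computes exact Shapley values by brute-force enumeration of feature subsets. It is not the SHAP library and not the package's XGBoost model: `toy_model`, the feature names (`precip_mm`, `antecedent_precip_mm`, `slope`), and all numeric values are hypothetical stand-ins chosen only to show how a prediction is split into per-feature contributions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one sample by enumerating feature subsets.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for target in names:
        others = [name for name in names if name != target]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {k: (x[k] if k in coalition else baseline[k]) for k in names}
                with_target = dict(without)
                with_target[target] = x[target]
                total += weight * (predict(with_target) - predict(without))
        phi[target] = total
    return phi


def toy_model(f):
    """Hypothetical stand-in for a trained model (assumption, not from the package)."""
    return (
        0.5 * f["precip_mm"]
        + 0.2 * f["antecedent_precip_mm"]
        - 2.0 * f["slope"]
        + 0.01 * f["precip_mm"] * f["antecedent_precip_mm"]
    )


# Hypothetical sample and reference (baseline) feature values
sample = {"precip_mm": 10.0, "antecedent_precip_mm": 30.0, "slope": 0.5}
baseline = {"precip_mm": 4.0, "antecedent_precip_mm": 20.0, "slope": 0.3}

phi = shapley_values(toy_model, sample, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the model output for the sample and for the baseline, which is the additivity property that makes per-feature importances comparable across forcings, template attributes, and memory terms. For tree ensembles, SHAP tooling computes such values with far more efficient algorithms than the enumeration used here.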
The contents are intended to support interpretation of the manuscript figures and results, provide context for how intermittency metrics were generated at multiple scales, and preserve the key artifacts needed to understand and reuse the analysis workflow.

The package contains a high-level directory summary file (`summary.txt`) and four main content folders: (1) `evaluation_plots` contains evaluation figures and supporting evaluation datasets; (2) `intermittency_plots` contains the intermittency-focused analysis notebook and prepared datasets; (3) `ml-training-and-shap_values_plots` contains ML training inputs, SHAP outputs, and figure-generation notebooks; and (4) `watershed_mesh_and_ats_input` contains ATS model setup materials, forcing inputs, geometry, and selected run files.

More specifically, the `evaluation_plots` folder contains the notebook used for ATS evaluation plotting and site-specific evaluation datasets. These include evapotranspiration and water-balance products for the three watersheds, as well as an Oak Creek field-measurement discharge file. The `intermittency_plots` folder contains the notebook used for intermittency analysis and the prepared datasets used to analyze intermittent and non-intermittent wetness behavior across the study watersheds. The `ml-training-and-shap_values_plots` folder contains notebooks and outputs for the machine-learning and explainability workflow. This includes the main XGBoost and SHAP notebook(s), a beeswarm plotting notebook, target-feature tables for machine-learning training, SHAP summary tables, and per-sample SHAP value archives. The `watershed_mesh_and_ats_input` folder contains ATS-related watershed inputs and supporting materials. This includes mesh and shape products, ATS-readable LAI and meteorological forcing inputs, selected ATS spinup and transient-run files, and a watershed workflow example notebook.
Subdirectories are organized by watershed where applicable. All files are in one of the following formats: .cpg (codepage files), .csv (comma-separated values), .dbf (database files), .exo (Exodus mesh format), .h5 (HDF5 format), .ipynb (Jupyter notebooks), .pkl (Python pickle), .prj (projection files), .sh (shell scripts), .shp (shapefile geometry), .shx (shapefile index), .txt (text files), or .xml (markup data).
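After downloading and unpacking the package, a quick inventory by file extension can confirm that the contents match the formats listed above. The following is a minimal sketch using only the Python standard library; the function name `inventory_by_extension` and the example path are hypothetical and not part of the package.

```python
from collections import Counter
from pathlib import Path


def inventory_by_extension(root):
    """Count files under `root` by lowercase extension, searching recursively."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without a suffix are grouped under a placeholder key
            counts[path.suffix.lower() or "(no extension)"] += 1
    return dict(sorted(counts.items()))
```

For example, `inventory_by_extension("path/to/unpacked_package")` returns a dictionary that maps each extension, such as `.csv` or `.ipynb`, to the number of files found with it, where the path is a placeholder for wherever the package was unpacked.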
Research Organization:
River Corridor Hydro-biogeochemistry from Molecular to Multi-Basin Scales SFA
Sponsoring Organization:
ESS-DIVE; U.S. DOE > Office of Science > Biological and Environmental Research (BER)
DOE Contract Number:
AC02-05CH11231
Other Award/Contract Number:
DOE Award #54737
OSTI ID:
3023133
Country of Publication:
United States
Language:
English