Data and scripts from: “Denoising autoencoder for reconstructing sensor observation data and predicting evapotranspiration: noisy and missing values repair and uncertainty quantification”
- Lawrence Berkeley National Laboratory
This data package includes data and scripts from the manuscript “Denoising autoencoder for reconstructing sensor observation data and predicting evapotranspiration: noisy and missing values repair and uncertainty quantification”. The study addressed common challenges faced in environmental sensing and modeling, including uncertain input data, missing sensor observations, and high-dimensional datasets with interrelated but redundant variables. Point-scale meteorological and soil sensor observations were perturbed with noise and missing values, and denoising autoencoder (DAE) neural networks were developed to reconstruct the perturbed data and then predict evapotranspiration (ET). The study concluded that (1) the reconstruction quality of each variable depends on its cross-correlation and alignment to the underlying data structure, (2) uncertainties from the models were overall stronger than those from the data corruption, and (3) there was a tradeoff between reducing bias and reducing variance when evaluating the uncertainty of the machine learning models.

This package includes:
(1) Four ipython scripts (.ipynb): “DAE_train.ipynb” trains and evaluates DAE neural networks, “DAE_predict.ipynb” makes predictions from the trained DAE models, “ET_train.ipynb” trains and evaluates ET prediction neural networks, and “ET_predict.ipynb” makes predictions from trained ET models.
(2) One python file (.py): “methods.py” includes all user-defined functions and python code used in the ipython scripts.
(3) A “sub_models” folder that includes five trained DAE neural networks (in pytorch format, .pt), which can be used to ingest input data before it is fed to the downstream ET models in “ET_train.ipynb” or “ET_predict.ipynb”.
(4) Two data files (.csv). Daily meteorological, vegetation, and soil data are in “df_data.csv”, and “df_meta.csv” contains the location and time information for “df_data.csv”. Each row (index) in “df_meta.csv” corresponds to the row with the same index in “df_data.csv”. These data files are formatted to follow the data structure requirements of the ipython scripts and can be used in them directly, and they have been shuffled chronologically to train machine learning models. The meteorological and soil data were collected using point sensors between 2019 and 2023 at
(4.a) Three shrub-dominated field sites in East River, Colorado (named “ph1”, “ph2” and “sg5” in “df_meta.csv”, where “ph1” and “ph2” were located at PumpHouse Hillslopes, and “sg5” was at Snodgrass Mountain meadow) and
(4.b) One outdoor, mesoscale, and herbaceous-dominated experiment in Berkeley, California (named “tb” in “df_meta.csv”, short for Smartsoils Testbed at Lawrence Berkeley National Lab).
- See "df_data_dd.csv" and "df_meta_dd.csv" for variable descriptions and the Methods section for additional data processing steps. See "flmd.csv" and "README.txt" for brief file descriptions.
- All ipython scripts and python files are written in Python and require Python software to run.
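Because “df_meta.csv” and “df_data.csv” are linked only by row position, a reader may want to pair them explicitly before filtering by site. The following is a minimal sketch of that pairing using only the Python standard library; it is not taken from “methods.py”. The helper name `pair_rows`, the column names (`site`, `date`, `air_temp`), and the sample values are hypothetical placeholders, so consult "df_data_dd.csv" and "df_meta_dd.csv" for the actual variable names.

```python
import csv
import io


def pair_rows(meta_file, data_file):
    """Pair row i of the metadata table with row i of the data table."""
    meta_rows = list(csv.DictReader(meta_file))
    data_rows = list(csv.DictReader(data_file))
    if len(meta_rows) != len(data_rows):
        raise ValueError("metadata and data tables must have the same number of rows")
    # Positional pairing: no key column is assumed, only the shared row order.
    # If both tables share a column name, the value from the data table wins.
    return [{**meta, **data} for meta, data in zip(meta_rows, data_rows)]


# Tiny in-memory stand-ins for "df_meta.csv" and "df_data.csv".
# Column names and values here are hypothetical, not taken from the package.
meta_csv = io.StringIO("site,date\ntb,2021-06-01\nph1,2020-07-15\ntb,2022-03-09\n")
data_csv = io.StringIO("air_temp\n18.2\n12.5\n14.1\n")

rows = pair_rows(meta_csv, data_csv)

# Keep only the rows from the Berkeley testbed, named "tb" in the metadata.
tb_rows = [row for row in rows if row["site"] == "tb"]
print(len(tb_rows))
```

With the real files, the same function could be called on handles opened with `open("df_meta.csv", newline="")` and `open("df_data.csv", newline="")`. The length check guards against pairing tables that have fallen out of step, which matters here because the rows are shuffled and cannot be realigned by date order.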
- Research Organization:
- Watershed Function SFA
- Sponsoring Organization:
- U.S. DOE > Office of Science > Biological and Environmental Research (BER)
- DOE Contract Number:
- AC02-05CH11231
- OSTI ID:
- 2561511
- Country of Publication:
- United States
- Language:
- English
Similar Records
Denoising Autoencoder for Reconstructing Sensor Observation Data and Predicting Evapotranspiration: Noisy and Missing Values Repair and Uncertainty Quantification
Journal Article · Tue Sep 30 20:00:00 EDT 2025 · Water Resources Research · OSTI ID:3006214
Estimating Watershed Subsurface Permeability From Stream Discharge Data using Deep Neural Networks, Frontiers in Earth Science: Dataset
Dataset · Tue Dec 31 23:00:00 EST 2019 · OSTI ID:1756193
CMLM (Co-Optimized Machine-Learned Manifolds) [SWR-23-41]
Software · Mon Feb 05 19:00:00 EST 2024 · OSTI ID:code-122681
Related Subjects
54 ENVIRONMENTAL SCIENCES
Data and model uncertainty
EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC PRESSURE > ATMOSPHERIC PRESSURE MEASUREMENTS
EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC RADIATION > LONGWAVE RADIATION
EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC RADIATION > NET RADIATION
EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC RADIATION > SHORTWAVE RADIATION
EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC TEMPERATURE > SURFACE TEMPERATURE > AIR TEMPERATURE
EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC TEMPERATURE > SURFACE TEMPERATURE > SKIN TEMPERATURE
EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC WATER VAPOR > WATER VAPOR INDICATORS > VAPOR PRESSURE
EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC WATER VAPOR > WATER VAPOR PROCESSES > EVAPOTRANSPIRATION
EARTH SCIENCE > ATMOSPHERE > ATMOSPHERIC WINDS > SURFACE WINDS > WIND SPEED
EARTH SCIENCE > ATMOSPHERE > PRECIPITATION > PRECIPITATION AMOUNT > 24 HOUR PRECIPITATION AMOUNT
EARTH SCIENCE > BIOSPHERE > VEGETATION > VEGETATION INDEX > LEAF AREA INDEX (LAI)
EARTH SCIENCE > LAND SURFACE > SOILS > SOIL HEAT BUDGET
EARTH SCIENCE > LAND SURFACE > SOILS > SOIL MOISTURE/WATER CONTENT
ESS-DIVE CSV File Formatting Guidelines Reporting Format
ESS-DIVE File Level Metadata Reporting Format
Evapotranspiration
Machine learning