Denoising Autoencoder for Reconstructing Sensor Observation Data and Predicting Evapotranspiration: Noisy and Missing Values Repair and Uncertainty Quantification
Journal Article · Water Resources Research
Abstract
Machine learning (ML) methods applied in scientific research often deal with interrelated features in high‐dimensional data. Reducing data noise and redundancy is needed to increase prediction accuracy and efficiency, especially when dealing with data from field sensors. We explored an unsupervised learning method, the denoising autoencoder (DAE), to extract the underlying data structure from noisy raw data in the context of predicting hydrologic quantities from multiple field sensors. These sensors have intrinsic instrumental noise and occasional malfunctions that cause missing values. Our DAE neural network reconstructed meteorological sensor data containing noise and missing values to predict evapotranspiration (ET) in a mountainous watershed. The DAE reconstructed the sensor variables with a mean coefficient of determination of 0.77 across 15 dimensions representing individual sensors. It reduced variance and bias uncertainties compared to a classical autoencoder model. The reconstruction quality varied across dimensions depending on their cross‐correlation and alignment with the underlying data structure. Uncertainties arising from the model structure were overall higher than those resulting from data corruption. We attached the DAE structure to a downstream ET‐prediction neural network in three formats and achieved reasonably accurate ET predictions. The use of the DAE notably reduced variance uncertainty in ET prediction. However, excessive variance reduction may be accompanied by an increase in bias due to the intrinsic bias‐variance tradeoff. Our method of evaluating and reducing uncertainties in aggregated data from different sources can be used to improve predictive models, process understanding, and uncertainty quantification for better water resource management.
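The denoising idea in the abstract can be illustrated with a minimal sketch: corrupt clean inputs with noise and simulated dropouts, then fit a model that maps the corrupted inputs back to the clean ones through a narrow bottleneck. The example below is not the authors' neural network. It swaps in a linear stand-in (reduced-rank least squares) so that it stays short and deterministic, and everything in it is an assumption made for illustration: the synthetic 15-channel data, the noise level, the missing fraction, the bottleneck size, and the function names `corrupt` and `fit_linear_dae`.

```python
import numpy as np

def corrupt(x, noise_std, missing_frac, rng):
    """Add Gaussian noise, then zero out a random fraction of entries
    to mimic sensor dropouts (zero equals the mean for standardized data)."""
    noisy = x + rng.normal(0.0, noise_std, size=x.shape)
    keep = rng.random(x.shape) >= missing_frac
    return np.where(keep, noisy, 0.0)

def fit_linear_dae(x_clean, x_corrupt, bottleneck):
    """Linear stand-in for a DAE: reduced-rank least squares from
    corrupted inputs to clean targets (hypothetical helper)."""
    # Unconstrained least-squares map from corrupted to clean data.
    b_ols, *_ = np.linalg.lstsq(x_corrupt, x_clean, rcond=None)
    # Keep only the leading directions of the fitted values (the bottleneck).
    _, _, vt = np.linalg.svd(x_corrupt @ b_ols, full_matrices=False)
    v_r = vt[:bottleneck].T
    encoder = b_ols @ v_r   # shape (n_features, bottleneck)
    decoder = v_r.T         # shape (bottleneck, n_features)
    return encoder, decoder

# Synthetic data (assumption): 15 correlated channels driven by 3 latent factors.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(3, 15))
train_raw = rng.normal(size=(2000, 3)) @ mixing
test_raw = rng.normal(size=(500, 3)) @ mixing
mu, sigma = train_raw.mean(axis=0), train_raw.std(axis=0)
x_train = (train_raw - mu) / sigma
x_test = (test_raw - mu) / sigma

# Train on one corrupted copy of the training data, with clean targets.
encoder, decoder = fit_linear_dae(
    x_train, corrupt(x_train, 0.3, 0.1, rng), bottleneck=3
)

# Evaluate on freshly corrupted test data.
x_test_bad = corrupt(x_test, 0.3, 0.1, rng)
x_test_hat = x_test_bad @ encoder @ decoder
mse_corrupt = float(np.mean((x_test_bad - x_test) ** 2))
mse_dae = float(np.mean((x_test_hat - x_test) ** 2))
print(f"MSE of corrupted input:  {mse_corrupt:.3f}")
print(f"MSE of reconstruction:   {mse_dae:.3f}")
```

Because the channels are strongly cross‐correlated in this toy setup, the reconstruction can borrow information from intact channels to repair noisy or missing ones, which is the same mechanism the abstract points to when it ties reconstruction quality to cross‐correlation.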
Plain Language Summary
We present a machine learning method, namely the denoising autoencoder, which reduces the effects of data noise and missing values typically present in scientific data sets collected through sensor measurements. This method selects the most relevant information from noisy raw data collected by the instruments and fills in missing values. To demonstrate the effectiveness of our method, we applied it to predict evapotranspiration, a hydrologic variable that represents the water moved from the land surface to the atmosphere through a combination of evaporation and plant water use (transpiration). We also used a random sampling technique (the Monte Carlo method) to compare the uncertainty in the predictions when using the raw and noisy data versus the reconstructed data. The denoising process produced more accurate predictions of evapotranspiration with less uncertainty. Improved predictions of evapotranspiration can lead to a better understanding and accounting of water budgets. This ML approach is broadly suitable for a wide variety of applications that involve noisy sensor data with missing values.
Key Points
- We used a denoising autoencoder (DAE) neural network to reduce noise in meteorological and soil sensor observations by on average
- We used Monte Carlo sampling to estimate the bias and variance of all model outputs, including uncertainty sources from data and the model
- We attached the DAE component to a downstream neural network to predict ET with the variance reduced by , compared to that without the DAE
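The Monte Carlo comparison described above can be sketched as follows: draw many corrupted copies of the same inputs, run a predictor on each, and split the error into squared bias (how far the average prediction sits from the truth) and variance (how much predictions scatter across draws). This is a minimal sketch under stated assumptions, not the authors' procedure. It covers only uncertainty from data corruption, not from model structure, and the function name `mc_bias_variance`, the toy predictors, and the noise level are all hypothetical.

```python
import numpy as np

def mc_bias_variance(predict, x_clean, y_true, corrupt, n_draws, rng):
    """Estimate squared bias and variance of `predict` under repeated
    random corruption of the inputs (hypothetical helper)."""
    preds = np.stack([predict(corrupt(x_clean, rng)) for _ in range(n_draws)])
    mean_pred = preds.mean(axis=0)
    bias_sq = float(np.mean((mean_pred - y_true) ** 2))  # averaged over samples
    variance = float(np.mean(preds.var(axis=0)))         # scatter across draws
    return bias_sq, variance

# Toy setup (assumption): the target equals the clean input,
# and corruption is additive Gaussian noise with standard deviation 0.5.
rng = np.random.default_rng(1)
x_clean = np.linspace(-1.0, 1.0, 50)
y_true = x_clean.copy()

def add_noise(x, rng):
    return x + rng.normal(0.0, 0.5, size=x.shape)

def raw_predictor(x):
    return x          # passes the noise straight through

def shrunk_predictor(x):
    return 0.5 * x    # damps the noise but pulls predictions toward zero

bias_raw, var_raw = mc_bias_variance(raw_predictor, x_clean, y_true, add_noise, 200, rng)
bias_shr, var_shr = mc_bias_variance(shrunk_predictor, x_clean, y_true, add_noise, 200, rng)
print(f"raw:    bias^2={bias_raw:.4f}  variance={var_raw:.4f}")
print(f"shrunk: bias^2={bias_shr:.4f}  variance={var_shr:.4f}")
```

In this toy case the shrunk predictor has lower variance but higher squared bias than the raw one, a small-scale illustration of the bias‐variance tradeoff that the abstract cautions about when variance reduction becomes excessive.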
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- US Department of Energy; USDOE Office of Science (SC), Biological and Environmental Research (BER) (SC-23), Climate and Environmental Sciences Division (SC-23.1)
- Grant/Contract Number:
- AC02-05CH11231
- OSTI ID:
- 3006214
- Journal Information:
- Journal Name: Water Resources Research; Journal Volume: 61; Journal Issue: 10
- Country of Publication:
- United States
- Language:
- English
Similar Records
Data and scripts from: “Denoising autoencoder for reconstructing sensor observation data and predicting evapotranspiration: noisy and missing values repair and uncertainty quantification”
Dataset · Tue Dec 31 23:00:00 EST 2024 · OSTI ID:2561511
Neural network denoising of x-ray images from high-energy-density experiments
Journal Article · Thu Jun 27 20:00:00 EDT 2024 · Review of Scientific Instruments · OSTI ID:2406543