DOE PAGES · U.S. Department of Energy
Office of Scientific and Technical Information

Title: Efficient surrogate modeling methods for large-scale Earth system models based on machine-learning techniques

Journal Article · Geoscientific Model Development (Online)

Abstract. Improving predictive understanding of Earth system variability and change requires data–model integration. Efficient data–model integration for complex models requires surrogate modeling to reduce model evaluation time. However, building a surrogate of a large-scale Earth system model (ESM) with many output variables is computationally intensive because it involves a large number of expensive ESM simulations. In this work, we propose an efficient surrogate method capable of using a few ESM runs to build an accurate and fast-to-evaluate surrogate system of model outputs over large spatial and temporal domains. We first use singular value decomposition to reduce the output dimensions and then use Bayesian optimization techniques to generate an accurate neural network surrogate model based on limited ESM simulation samples. Our machine-learning-based surrogate methods can build and evaluate a large surrogate system of many variables quickly. Thus, whenever the quantities of interest change, such as a different objective function, a new site, or a longer simulation time, we can simply extract the information of interest from the surrogate system without rebuilding new surrogates, which significantly reduces computational effort. We apply the proposed method to a regional ecosystem model to approximate the relationship between eight model parameters and 42 660 carbon flux outputs. Results indicate that, using only 20 model simulations, we can build an accurate surrogate system of the 42 660 variables, wherein the consistency between the surrogate prediction and actual model simulation is 0.93 and the mean squared error is 0.02. This highly accurate and fast-to-evaluate surrogate system will greatly enhance the computational efficiency of data–model integration to improve predictions and advance our understanding of the Earth system.
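
The two-step idea in the abstract (compress the outputs with singular value decomposition, then learn a map from parameters to the reduced coefficients) can be illustrated with a minimal sketch. This is not the authors' implementation: the neural network tuned by Bayesian optimization is replaced here by a plain affine least-squares fit, and `toy_model`, `build_svd_surrogate`, the 500-output size, and the random data are all hypothetical choices made only so the example runs quickly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes: 20 runs and 8 parameters echo the abstract; 500 outputs is illustrative.
N_SAMPLES, N_PARAMS, N_OUTPUTS = 20, 8, 500

# Hypothetical stand-in for an expensive model: an affine map from parameters to outputs.
_A = rng.normal(size=(N_PARAMS, N_OUTPUTS))
_b = rng.normal(size=N_OUTPUTS)


def toy_model(params):
    """Return one row of outputs per row of parameters."""
    return np.atleast_2d(params) @ _A + _b


def build_svd_surrogate(params, outputs, n_components):
    """Fit a surrogate that predicts outputs through a reduced SVD basis."""
    mean = outputs.mean(axis=0)
    centered = outputs - mean

    # Thin SVD of the centered output matrix: centered = U @ diag(s) @ Vt.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]        # (n_components, n_outputs)
    coeffs = centered @ basis.T      # (n_samples, n_components)

    # Assumption: affine least squares stands in for the neural network surrogate.
    design = np.hstack([params, np.ones((params.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, coeffs, rcond=None)

    def predict(new_params):
        new_params = np.atleast_2d(new_params)
        d = np.hstack([new_params, np.ones((new_params.shape[0], 1))])
        # Predict reduced coefficients, then map back to the full output space.
        return d @ weights @ basis + mean

    return predict


train_params = rng.uniform(size=(N_SAMPLES, N_PARAMS))
train_outputs = toy_model(train_params)
predict = build_svd_surrogate(train_params, train_outputs, n_components=8)

test_params = rng.uniform(size=(5, N_PARAMS))
max_error = np.max(np.abs(predict(test_params) - toy_model(test_params)))
print(f"max abs error on held-out parameters: {max_error:.2e}")
```

Because the surrogate is fitted once for all output columns, any subset of outputs (a different site or time window, in the abstract's terms) can be read off by slicing the result of `predict` rather than refitting. A real ESM is nonlinear, so the affine fit here would need to be replaced by a more flexible regressor for the sketch to be useful beyond illustration.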

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER)
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1513382
Journal Information:
Geoscientific Model Development (Online), Vol. 12, Issue 5; ISSN 1991-9603
Publisher:
European Geosciences Union
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 19 works (citation information provided by Web of Science)

Cited By (2)

DeepClimGAN: A High-Resolution Climate Data Generator preprint January 2020
Extending a land-surface model with Sphagnum moss to simulate responses of a northern temperate bog to whole ecosystem warming and elevated CO2 journal January 2021
