DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A localized ensemble of approximate Gaussian processes for fast sequential emulation

Abstract

Much attention has been given to the computational cost associated with the fitting of an emulator. Substantially less attention is given to the computational cost of using that emulator for prediction. This is primarily because the cost of fitting an emulator is usually far greater than that of obtaining a single prediction, and predictions can often be obtained in parallel. In many settings, especially those requiring Markov chain Monte Carlo, predictions may arrive sequentially and parallelization is not possible. In this case, using an emulator procedure which can produce accurate predictions efficiently can lead to substantial time savings in practice. In this paper, we propose a global model approximate Gaussian process framework via extension of a popular local approximate Gaussian process (laGP) framework. Our proposed emulator can be viewed as a treed Gaussian process where the leaf nodes are laGP models, and the tree structure is learned greedily as a function of the prediction stream. The suggested method (called leapGP) has interpretable tuning parameters which control the time-memory trade-off. One reasonable choice of settings leads to an emulator with an $$\mathscr{O}(N^2)$$ training cost that makes predictions rapidly, with an asymptotic amortized cost of $$\mathscr{O}(\sqrt{N})$$.
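The idea of reusing local fits across a sequential prediction stream can be illustrated with a toy sketch. The code below is not the leapGP algorithm from the paper; it is a minimal illustration under stated assumptions. The class name `CachedLocalGP`, the fixed squared-exponential kernel, the nearest-neighbour local design, and the `radius` reuse rule are all hypothetical choices made for this example. Each new input either reuses a cached local fit (a "hub") that lies nearby or triggers a new local fit, so `radius` and `n_neighbors` act as a crude time-memory trade-off.

```python
import numpy as np


def sq_exp_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel matrix between the rows of a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / lengthscale**2)


class CachedLocalGP:
    """Toy local-GP emulator that caches neighbourhood fits ("hubs").

    Illustrative only: the design rule, kernel, and reuse radius are
    assumptions for this sketch, not the published method.
    """

    def __init__(self, X, y, n_neighbors=30, radius=0.1, nugget=1e-6):
        self.X, self.y = np.asarray(X, float), np.asarray(y, float)
        self.n_neighbors, self.radius, self.nugget = n_neighbors, radius, nugget
        self.hubs = []  # each hub: (center, local inputs, K^{-1} y)

    def _fit_hub(self, x):
        # Local design: the n_neighbors training points closest to x.
        dist = np.linalg.norm(self.X - x, axis=1)
        idx = np.argsort(dist)[: self.n_neighbors]
        Xl, yl = self.X[idx], self.y[idx]
        K = sq_exp_kernel(Xl, Xl) + self.nugget * np.eye(len(idx))
        alpha = np.linalg.solve(K, yl)  # stored so later predictions are cheap
        hub = (x.copy(), Xl, alpha)
        self.hubs.append(hub)
        return hub

    def predict(self, x):
        x = np.asarray(x, float)
        hub = None
        if self.hubs:
            # Reuse the nearest cached hub when it lies within `radius`.
            centers = np.stack([h[0] for h in self.hubs])
            d = np.linalg.norm(centers - x, axis=1)
            j = int(np.argmin(d))
            if d[j] <= self.radius:
                hub = self.hubs[j]
        if hub is None:
            hub = self._fit_hub(x)  # no hub close enough: fit a new one
        _, Xl, alpha = hub
        k = sq_exp_kernel(x[None, :], Xl)[0]
        return float(k @ alpha)  # zero-mean GP predictive mean


# Synthetic training data and a slowly moving input stream that loosely
# mimics a Markov chain; both are made up for this example.
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 2))
y = np.sin(4 * X[:, 0]) + X[:, 1] ** 2
emu = CachedLocalGP(X, y)
stream = 0.5 + 0.01 * np.cumsum(rng.normal(size=(200, 2)), axis=0)
preds = [emu.predict(x) for x in stream]
```

Because consecutive inputs in the stream are close together, most calls to `predict` reuse an existing hub and avoid a new linear solve. A larger `radius` stores fewer hubs and runs faster at some cost in accuracy.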

Authors:
Rumsey, Kellin N. [1]; Huerta, Gabriel [2]; Tucker, J. Derek [2]
  1. Statistical Sciences, Los Alamos National Laboratory, Los Alamos, New Mexico, USA
  2. Statistical Sciences, Sandia National Laboratories, Albuquerque, New Mexico, USA
Publication Date:
2023-05-05
Research Org.:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA); USDOE Laboratory Directed Research and Development (LDRD) Program
OSTI Identifier:
1972761
Alternate Identifier(s):
OSTI ID: 1972762; OSTI ID: 1975649
Report Number(s):
LA-UR-22-30978
Journal ID: ISSN 2049-1573; e576
Grant/Contract Number:  
89233218CNA000001
Resource Type:
Published Article
Journal Name:
Stat
Additional Journal Information:
Journal Name: Stat; Journal Volume: 12; Journal Issue: 1; Journal ID: ISSN 2049-1573
Publisher:
Wiley Blackwell (John Wiley & Sons)
Country of Publication:
United Kingdom
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Mathematics; Gaussian process; approximation; emulation; MCMC; computer models; local approximate GPs

Citation Formats

Rumsey, Kellin N., Huerta, Gabriel, and Tucker, J. Derek. A localized ensemble of approximate Gaussian processes for fast sequential emulation. United Kingdom: N. p., 2023. Web. doi:10.1002/sta4.576.
Rumsey, Kellin N., Huerta, Gabriel, & Tucker, J. Derek. A localized ensemble of approximate Gaussian processes for fast sequential emulation. United Kingdom. https://doi.org/10.1002/sta4.576
Rumsey, Kellin N., Huerta, Gabriel, and Tucker, J. Derek. 2023. "A localized ensemble of approximate Gaussian processes for fast sequential emulation". United Kingdom. https://doi.org/10.1002/sta4.576.
@article{osti_1972761,
title = {A localized ensemble of approximate Gaussian processes for fast sequential emulation},
author = {Rumsey, Kellin N. and Huerta, Gabriel and Tucker, J. Derek},
abstractNote = {Much attention has been given to the computational cost associated with the fitting of an emulator. Substantially less attention is given to the computational cost of using that emulator for prediction. This is primarily because the cost of fitting an emulator is usually far greater than that of obtaining a single prediction, and predictions can often be obtained in parallel. In many settings, especially those requiring Markov chain Monte Carlo, predictions may arrive sequentially and parallelization is not possible. In this case, using an emulator procedure which can produce accurate predictions efficiently can lead to substantial time savings in practice. In this paper, we propose a global model approximate Gaussian process framework via extension of a popular local approximate Gaussian process (laGP) framework. Our proposed emulator can be viewed as a treed Gaussian process where the leaf nodes are laGP models, and the tree structure is learned greedily as a function of the prediction stream. The suggested method (called leapGP) has interpretable tuning parameters which control the time-memory trade-off. One reasonable choice of settings leads to an emulator with an $\mathscr{O}(N^2)$ training cost that makes predictions rapidly, with an asymptotic amortized cost of $\mathscr{O}(\sqrt{N})$.},
doi = {10.1002/sta4.576},
journal = {Stat},
number = 1,
volume = 12,
place = {United Kingdom},
year = {2023},
month = {May}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1002/sta4.576
