Title: Forward variable selection enables fast and accurate dynamic system identification with Karhunen-Loève decomposed Gaussian processes

Journal Article · PLoS ONE

A promising approach for scalable Gaussian processes (GPs) is the Karhunen-Loève (KL) decomposition, in which the GP kernel is represented by a set of basis functions that are the eigenfunctions of the kernel operator. Such decomposed kernels have the potential to be very fast, and do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, and variable selection thus becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA (BSS-ANOVA) kernel, coupled with fast Gibbs sampling in a fully Bayesian approach. It quickly and effectively limits the number of terms, yielding a method with competitive accuracies and competitive training and inference times for tabular datasets of low feature set dimensionality. Theoretical computational complexities are O(NP^2) in training and O(P) per point in inference, where N is the number of instances and P the number of expansion terms. The inference speed and accuracy make the method especially useful for dynamic systems identification: the dynamics are modeled in the tangent space as a static problem, and the learned dynamics are then integrated using a high-order scheme. The methods are demonstrated on two dynamic datasets: a ‘Susceptible, Infected, Recovered’ (SIR) toy problem and the experimental ‘Cascaded Tanks’ benchmark dataset. Comparisons on the static prediction of time derivatives are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing points scalable GP, while for the time-series prediction comparisons are made with LSTM and GRU recurrent neural networks (RNNs) along with the SINDy package.
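
The dynamic system identification workflow described in the abstract (fit the time derivatives as a static regression problem, then integrate the learned dynamics with a high-order scheme) can be illustrated with a minimal sketch on an SIR toy problem. This is not the authors' implementation: ordinary least squares on a small fixed polynomial basis stands in for the BSS-ANOVA KL expansion, forward variable selection, and Gibbs sampling, and a classical fourth-order Runge-Kutta step stands in for the unspecified high-order integrator. The rate constants `BETA` and `GAMMA`, the initial conditions, the step size, and all function names are assumptions chosen for illustration, and the derivative samples are taken noise-free from the true model.

```python
import numpy as np

BETA, GAMMA = 0.5, 0.1  # hypothetical SIR infection and recovery rates


def sir_rhs(x):
    """True SIR dynamics; used only to generate synthetic training data."""
    s, i, _ = x
    return np.array([-BETA * s * i, BETA * s * i - GAMMA * i, GAMMA * i])


def rk4_step(f, x, dt):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(x)."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def simulate(f, x0, dt, n_steps):
    """Integrate from x0 and return an array of shape (n_steps + 1, 3)."""
    traj = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        traj.append(rk4_step(f, traj[-1], dt))
    return np.array(traj)


def features(x):
    """P = 6 basis terms in (S, I): a stand-in for the KL basis functions."""
    x = np.atleast_2d(x)
    s, i = x[:, 0], x[:, 1]
    return np.column_stack([np.ones(len(x)), s, i, s * i, s * s, i * i])


# Static problem: regress derivatives on states (tangent-space targets).
DT, N_STEPS = 0.5, 100
train_x0 = [(0.99, 0.01, 0.0), (0.8, 0.2, 0.0), (0.6, 0.3, 0.1), (0.5, 0.5, 0.0)]
states = np.vstack([simulate(sir_rhs, x0, DT, N_STEPS) for x0 in train_x0])
derivs = np.array([sir_rhs(x) for x in states])  # assumed noise-free samples

# Linear-in-parameters fit; coef has shape (P, 3), one column per state.
coef, *_ = np.linalg.lstsq(features(states), derivs, rcond=None)


def learned_rhs(x):
    """Inference costs O(P) per point: one feature row times the weights."""
    return (features(x) @ coef)[0]


# Dynamic prediction: integrate the learned model from a held-out start.
test_x0 = (0.9, 0.1, 0.0)
true_traj = simulate(sir_rhs, test_x0, DT, N_STEPS)
pred_traj = simulate(learned_rhs, test_x0, DT, N_STEPS)
max_err = np.max(np.abs(true_traj - pred_traj))
print("max trajectory error:", max_err)
```

Because the model is linear in its P expansion weights, the least-squares fit is dominated by forming and factoring a P-column design matrix over N samples, which is consistent with the O(NP^2) training cost quoted in the abstract, and each prediction is a single length-P dot product per output.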

Sponsoring Organization:
USDOE
OSTI ID:
2447520
Journal Information:
Journal Name: PLoS ONE; Journal Volume: 19; Journal Issue: 9; ISSN 1932-6203
Publisher:
Public Library of Science (PLoS)
Country of Publication:
United States
Language:
English
