High-Dimensional Intrinsic Interpolation Using Gaussian Process Regression and Diffusion Maps
Abstract
This article considers the challenging task of estimating geologic properties of interest using a suite of proxy measurements. The current work recasts this task as a manifold learning problem. In this process, this article introduces a novel regression procedure for intrinsic variables constrained onto a manifold embedded in an ambient space. The procedure is meant to sharpen high-dimensional interpolation by inferring nonlinear correlations from the data being interpolated. The proposed approach augments manifold learning procedures with a Gaussian process regression. It first identifies, using diffusion maps, a low-dimensional manifold embedded in an ambient high-dimensional space associated with the data. It relies on the diffusion distance associated with this construction to define a distance function with which the data model is equipped. This distance function is then used to compute the correlation structure of a Gaussian process that describes the statistical dependence of quantities of interest in the high-dimensional ambient space. The proposed method is applicable to arbitrarily high-dimensional data sets. Here, it is applied to subsurface characterization using a suite of well log measurements. The predictions obtained in original, principal component, and diffusion space are compared using both qualitative and quantitative metrics. Considerable improvement in the prediction of the geological structural properties is observed with the proposed method.
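The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `diffusion_map` and `gp_predict`, the parameters `epsilon`, `n_coords`, `t`, `length_scale`, and `noise`, and the choice of a squared-exponential correlation function are all assumptions made for illustration. The sketch also assumes the diffusion map is built from the proxy measurements at every location (observed and unobserved), so that Euclidean distance between the returned coordinates approximates the diffusion distance up to a constant factor.

```python
import numpy as np

def diffusion_map(X, epsilon, n_coords, t=1):
    """Return truncated diffusion coordinates for the rows of X (hypothetical helper)."""
    # Pairwise squared Euclidean distances in the ambient space.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq / epsilon)              # Gaussian affinity kernel
    d = K.sum(axis=1)                      # node degrees
    # Symmetric normalization; S has the same eigenvalues as the Markov matrix D^{-1} K.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]         # sort eigenpairs by decreasing eigenvalue
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]       # right eigenvectors of D^{-1} K
    # Drop the trivial constant eigenvector (eigenvalue 1) and scale by lambda^t.
    return psi[:, 1:n_coords + 1] * vals[1:n_coords + 1] ** t

def gp_predict(Z_train, y_train, Z_test, length_scale, noise=1e-6):
    """GP posterior mean using distances between diffusion coordinates (hypothetical helper)."""
    def kern(A, B):
        # Squared-exponential correlation (an assumed choice of kernel).
        sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return np.exp(-0.5 * sq / length_scale ** 2)
    mean = y_train.mean()                  # constant mean, as in simple kriging
    K = kern(Z_train, Z_train) + noise * np.eye(len(Z_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - mean))
    return mean + kern(Z_test, Z_train) @ alpha
```

In use, one would call `diffusion_map` once on the proxy measurements for all locations, then pass the rows where the quantity of interest is known as `Z_train` and the remaining rows as `Z_test`. The values of `epsilon` and `length_scale` would have to be tuned to the data.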
 Authors:
 Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Center for Applied Scientific Computing
 Univ. of Southern California, Los Angeles, CA (United States). Sonny Astani Dept. of Civil and Environmental Engineering
 Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Atmospheric, Earth and Energy Division
 Publication Date:
 Research Org.:
 Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
 Sponsoring Org.:
 USDOE Laboratory Directed Research and Development (LDRD) Program
 OSTI Identifier:
 1400095
 Report Number(s):
 LLNL-JRNL-728760
Journal ID: ISSN 1874-8961; TRN: US1703098
 Grant/Contract Number:
 AC52-07NA27344
 Resource Type:
 Journal Article: Accepted Manuscript
 Journal Name:
 Mathematical Geosciences
 Additional Journal Information:
 Journal Volume: 50; Journal Issue: 1; Journal ID: ISSN 1874-8961
 Publisher:
 Springer
 Country of Publication:
 United States
 Language:
 English
 Subject:
 58 GEOSCIENCES; 97 MATHEMATICS AND COMPUTING; Gaussian process regression; Kriging; Interpolation on manifold; Intrinsic interpolation; Intrinsic metrics; Diffusion distance
Citation Formats
Thimmisetty, Charanraj A., Ghanem, Roger G., White, Joshua A., and Chen, Xiao. High-Dimensional Intrinsic Interpolation Using Gaussian Process Regression and Diffusion Maps. United States: N. p., 2017.
Web. doi:10.1007/s11004-017-9705-y.
Thimmisetty, Charanraj A., Ghanem, Roger G., White, Joshua A., & Chen, Xiao. High-Dimensional Intrinsic Interpolation Using Gaussian Process Regression and Diffusion Maps. United States. doi:10.1007/s11004-017-9705-y.
Thimmisetty, Charanraj A., Ghanem, Roger G., White, Joshua A., and Chen, Xiao. 2017.
"High-Dimensional Intrinsic Interpolation Using Gaussian Process Regression and Diffusion Maps". United States.
doi:10.1007/s11004-017-9705-y.
@article{osti_1400095,
title = {High-Dimensional Intrinsic Interpolation Using Gaussian Process Regression and Diffusion Maps},
author = {Thimmisetty, Charanraj A. and Ghanem, Roger G. and White, Joshua A. and Chen, Xiao},
abstractNote = {This article considers the challenging task of estimating geologic properties of interest using a suite of proxy measurements. The current work recasts this task as a manifold learning problem. In this process, this article introduces a novel regression procedure for intrinsic variables constrained onto a manifold embedded in an ambient space. The procedure is meant to sharpen high-dimensional interpolation by inferring nonlinear correlations from the data being interpolated. The proposed approach augments manifold learning procedures with a Gaussian process regression. It first identifies, using diffusion maps, a low-dimensional manifold embedded in an ambient high-dimensional space associated with the data. It relies on the diffusion distance associated with this construction to define a distance function with which the data model is equipped. This distance function is then used to compute the correlation structure of a Gaussian process that describes the statistical dependence of quantities of interest in the high-dimensional ambient space. The proposed method is applicable to arbitrarily high-dimensional data sets. Here, it is applied to subsurface characterization using a suite of well log measurements. The predictions obtained in original, principal component, and diffusion space are compared using both qualitative and quantitative metrics. Considerable improvement in the prediction of the geological structural properties is observed with the proposed method.},
doi = {10.1007/s11004-017-9705-y},
journal = {Mathematical Geosciences},
number = 1,
volume = 50,
place = {United States},
year = 2017
}
