An adaptive Hessian approximated stochastic gradient MCMC method

Wang, Yating; Deng, Wei; Lin, Guang

doi:10.1016/j.jcp.2021.110150

Journal Article · Thu Feb 04 2021 · Journal of Computational Physics

DOI: https://doi.org/10.1016/j.jcp.2021.110150 · OSTI ID: 1853727

Wang, Yating [1]; Deng, Wei [1]; Lin, Guang [2]

[1] Purdue Univ., West Lafayette, IN (United States). Dept. of Mathematics
[2] Purdue Univ., West Lafayette, IN (United States). Dept. of Mathematics; School of Mechanical Engineering; Dept. of Statistics; Dept. of Earth, Atmospheric, and Planetary Sciences

Bayesian approaches have been successfully integrated into training deep neural networks. One popular family is stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods, which have gained increasing interest due to their ability to handle large datasets and their potential to avoid overfitting. Although standard SG-MCMC methods have shown great performance in a variety of problems, they may be inefficient when the random variables in the target posterior densities have scale differences or are highly correlated. Here, we present an adaptive Hessian approximated stochastic gradient MCMC method that incorporates local geometric information while sampling from the posterior. The idea is to apply stochastic approximation (SA) to sequentially update a preconditioning matrix at each iteration. The preconditioner carries second-order information and can guide the random walk of a sampler efficiently. Instead of computing and storing the full Hessian of the log posterior, we use a limited memory of the samples and their stochastic gradients to approximate the inverse Hessian-vector multiplication in the updating formula. Moreover, by smoothly optimizing the preconditioning matrix via SA, our proposed algorithm can asymptotically converge to the target distribution with a controllable bias under mild conditions. To reduce the computational burden of training and testing, we adopt a magnitude-based weight pruning method to enforce the sparsity of the network. Our method is user-friendly and demonstrates better learning results compared to standard SG-MCMC updating rules. The approximation of the inverse Hessian alleviates storage and computational complexities for high-dimensional models. Numerical experiments are performed on several problems, including sampling from a 2D correlated distribution, synthetic regression problems, and learning the numerical solutions of a heterogeneous elliptic PDE. The numerical results demonstrate great improvement in both the convergence rate and accuracy.
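The limited-memory inverse Hessian-vector product mentioned in the abstract can be illustrated with the textbook L-BFGS two-loop recursion. The sketch below is not taken from the paper and may differ from the authors' updating formula. The function name `lbfgs_inverse_hessian_vector`, the helpers `dot` and `axpy`, the curvature threshold, and the choice of initial scaling are all assumptions made for illustration. It takes a short history of pairs (s, y), where s is a difference of consecutive samples and y the matching difference of (stochastic) gradients, and returns an approximation of the inverse Hessian applied to a vector without ever forming a matrix.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def axpy(alpha: float, x: Sequence[float], y: Sequence[float]) -> Vector:
    """Return alpha * x + y as a new list."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def lbfgs_inverse_hessian_vector(
    pairs: Sequence[Tuple[Sequence[float], Sequence[float]]],
    v: Sequence[float],
) -> Vector:
    """Approximate H^{-1} v with the standard L-BFGS two-loop recursion.

    `pairs` holds (s, y) tuples ordered oldest first. Pairs whose
    curvature s.y is not positive are skipped (an assumed safeguard,
    with an illustrative threshold) so the implied matrix stays
    positive definite.
    """
    usable = [(s, y) for s, y in pairs if dot(s, y) > 1e-12]

    q = list(v)
    alphas = []
    # First loop: walk the history from newest to oldest.
    for s, y in reversed(usable):
        rho = 1.0 / dot(s, y)
        alpha = rho * dot(s, q)
        alphas.append(alpha)
        q = axpy(-alpha, y, q)

    # Initial inverse Hessian guess gamma * I, scaled by the newest pair.
    if usable:
        s_new, y_new = usable[-1]
        gamma = dot(s_new, y_new) / dot(y_new, y_new)
    else:
        gamma = 1.0  # no history: fall back to the identity
    r = [gamma * qi for qi in q]

    # Second loop: walk the history from oldest to newest.
    for (s, y), alpha in zip(usable, reversed(alphas)):
        rho = 1.0 / dot(s, y)
        beta = rho * dot(y, r)
        r = axpy(alpha - beta, s, r)
    return r
```

With m stored pairs in dimension d, one call costs O(md) time and memory, compared with O(d^2) storage for a dense Hessian. A quick sanity check is the secant condition: passing the newest y as `v` returns the newest s.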

Research Organization: Purdue Univ., West Lafayette, IN (United States)

Sponsoring Organization: National Science Foundation (NSF); US Army Research Office (ARO); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)

Grant/Contract Number: SC0021142; DMS-1555072; DMS-1736364; CMMI-1634832; CMMI-1560834; W911NF-15-1-0562

OSTI ID: 1853727

Alternate ID(s): OSTI ID: 1775932

Journal Information: Journal of Computational Physics, Vol. 432, Issue C; ISSN 0021-9991

Publisher: Elsevier

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
adaptive Bayesian method
deep learning
Hessian approximated stochastic gradient MCMC
stochastic approximation
limited memory BFGS
highly correlated density
