U.S. Department of Energy
Office of Scientific and Technical Information

Challenges in Markov Chain Monte Carlo for Bayesian Neural Networks

Journal Article · Statistical Science
DOI: https://doi.org/10.1214/21-sts840 · OSTI ID: 1976073

Markov chain Monte Carlo (MCMC) methods have not been broadly adopted in Bayesian neural networks (BNNs). This paper first reviews the main challenges in sampling from the parameter posterior of a neural network via MCMC. These challenges culminate in a lack of convergence to the parameter posterior. Nevertheless, this paper shows that a nonconverged Markov chain, generated via MCMC sampling from the parameter space of a neural network, can yield, via Bayesian marginalization, a valuable posterior predictive distribution of the output of the neural network. Further, classification examples based on multilayer perceptrons showcase highly accurate posterior predictive distributions. The postulate of limited scope for MCMC developments in BNNs is partially valid; an asymptotically exact parameter posterior seems less plausible, yet an accurate posterior predictive distribution is a tenable research avenue.
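
To illustrate the marginalization step the abstract refers to, the posterior predictive can be approximated by averaging the network output over the sampled parameters: p(y | x, D) ≈ (1/T) Σ_t p(y | x, θ_t). The sketch below is a minimal illustration under stated assumptions, not the paper's samplers or experiments. It assumes a random-walk Metropolis sampler, a 2-2-1 perceptron with tanh hidden units, an isotropic Gaussian prior, and a four-point XOR data set. All function names and settings (`mlp_prob`, `metropolis`, `posterior_predictive`, `prior_sd`, `step`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def mlp_prob(theta, x):
    # P(y = 1 | x, theta) for a 2-input MLP with 2 tanh hidden units.
    # theta holds 9 parameters: 4 hidden weights, 2 hidden biases,
    # 2 output weights, 1 output bias (an assumed layout).
    w1, b1, w2, b2 = theta[0:4], theta[4:6], theta[6:8], theta[8]
    h = [math.tanh(w1[2 * j] * x[0] + w1[2 * j + 1] * x[1] + b1[j])
         for j in range(2)]
    return sigmoid(w2[0] * h[0] + w2[1] * h[1] + b2)


def log_posterior(theta, data, prior_sd=3.0):
    # Unnormalized log posterior: Gaussian prior plus Bernoulli likelihood.
    log_prior = -0.5 * sum(t * t for t in theta) / prior_sd ** 2
    log_lik = 0.0
    for x, y in data:
        p = min(max(mlp_prob(theta, x), 1e-12), 1.0 - 1e-12)
        log_lik += math.log(p) if y == 1 else math.log(1.0 - p)
    return log_prior + log_lik


def metropolis(data, n_iter=2000, step=0.3, seed=0):
    # Random-walk Metropolis over the 9 network parameters.
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(9)]
    current = log_posterior(theta, data)
    chain = []
    for _ in range(n_iter):
        proposal = [t + rng.gauss(0.0, step) for t in theta]
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        chain.append(theta)
    return chain


def posterior_predictive(chain, x):
    # Bayesian marginalization: average the predictive over sampled parameters.
    return sum(mlp_prob(theta, x) for theta in chain) / len(chain)


# Toy XOR data; the chain is short and is not checked for convergence.
data = [((0.0, 0.0), 0), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 0)]
chain = metropolis(data)
kept = chain[len(chain) // 2:]  # discard the first half as burn-in
preds = [posterior_predictive(kept, x) for x, _ in data]
```

Here `preds` holds one predictive probability per input. The predictive is computed from whatever the chain visited, so it does not require individual parameters to be identifiable or the chain to have converged in parameter space, which is the distinction the abstract draws.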

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE Laboratory Directed Research and Development (LDRD) Program
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1976073
Journal Information:
Statistical Science, Vol. 37, Issue 3; ISSN 0883-4237
Publisher:
Institute of Mathematical Statistics
Country of Publication:
United States
Language:
English

Cited By (3)

Parameter Estimation in the Age of Degeneracy and Unidentifiability journal January 2022
Bayesian neural networks and dimensionality reduction preprint January 2020
Quantile Regression Neural Networks: A Bayesian Approach text January 2020