
Title: Advances in Bayesian Model Based Clustering Using Particle Learning

Technical Report · DOI: https://doi.org/10.2172/1010386 · OSTI ID: 1010386

Recent work by Carvalho, Johannes, Lopes and Polson, and by Carvalho, Lopes, Polson and Taddy, introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference in a large class of dynamic models. SMC techniques represent the underlying inference problem as one of state space estimation, which opens the way to inference via particle filtering. The key insight of Carvalho et al. was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually accessible only in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al. point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of recursively defined sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning, as it is being called, is especially well suited to the analysis of streaming data due to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data arrive, allowing direct quantification, at any instant during the observation process, of the uncertainty surrounding the underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals.
In Section 3 we describe several novel extensions to the original implementation of Carvalho et al. that allow us to retain the computational advantages of particle learning while improving the suitability of the methodology for the analysis of streaming data and simultaneously facilitating the real-time discovery of latent cluster structures. Section 4 demonstrates our methodological enhancements in the context of several simulated and classical data sets, showcasing the use of particle learning methods for online anomaly detection, label generation, drift detection, and semi-supervised classification, none of which would be achievable through a standard MCMC approach. Section 5 concludes with a discussion of future directions for research.
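
The resample-then-propagate idea and the role of recursively updated sufficient statistics can be illustrated with a minimal sketch. It is not the report's implementation: it assumes a one-dimensional Dirichlet process mixture of normals with known observation variance `sigma2`, a conjugate normal prior (`mu0`, `tau2`) on cluster means, and a concentration parameter `alpha`, whereas the report treats multivariate normals. All function names, parameter names, and default values below are hypothetical and chosen only for illustration.

```python
import numpy as np


def normal_pdf(y, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def component_predictives(y, counts, sums, alpha, sigma2, mu0, tau2):
    """Unnormalized predictive mass of y under each existing cluster,
    with the mass for opening a new cluster as the last entry."""
    counts = np.asarray(counts, dtype=float)
    sums = np.asarray(sums, dtype=float)
    # Posterior of each cluster mean from its sufficient statistics (n_j, sum_j).
    post_var = 1.0 / (1.0 / tau2 + counts / sigma2)
    post_mean = post_var * (mu0 / tau2 + sums / sigma2)
    existing = counts * normal_pdf(y, post_mean, post_var + sigma2)
    new = alpha * normal_pdf(y, mu0, tau2 + sigma2)
    return np.append(existing, new)


def particle_learning_step(particles, y, rng, alpha=1.0, sigma2=1.0,
                           mu0=0.0, tau2=10.0):
    """One update: resample on the predictive of y, then propagate."""
    comps = [component_predictives(y, p["counts"], p["sums"],
                                   alpha, sigma2, mu0, tau2)
             for p in particles]
    # Resample first: weights are each particle's predictive density for y.
    weights = np.array([c.sum() for c in comps])
    weights /= weights.sum()
    idx = rng.choice(len(particles), size=len(particles), p=weights)

    new_particles = []
    for i in idx:
        counts = list(particles[i]["counts"])  # copy: particles may repeat
        sums = list(particles[i]["sums"])
        # Propagate: draw the allocation of y, then update sufficient statistics.
        probs = comps[i] / comps[i].sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)
            sums.append(y)
        else:
            counts[k] += 1
            sums[k] += y
        new_particles.append({"counts": counts, "sums": sums})
    return new_particles


def run_particle_learning(data, n_particles=200, seed=0, **hyper):
    """Process observations one at a time, as they would arrive in a stream."""
    rng = np.random.default_rng(seed)
    particles = [{"counts": [], "sums": []} for _ in range(n_particles)]
    for y in data:
        particles = particle_learning_step(particles, float(y), rng, **hyper)
    return particles
```

In this sketch each particle stores only per-cluster counts and sums, so the cost of processing a new observation depends on the number of particles and clusters rather than on how many observations have already been seen. Calling `run_particle_learning([-5.1, -4.9, 5.2, 4.8])` returns a list of particles whose `counts` entries give one posterior sample each of the cluster sizes.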

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
1010386
Report Number(s):
LLNL-TR-421078; TRN: US201108%%422
Country of Publication:
United States
Language:
English

Similar Records

Risk Analysis of the Space Shuttle: Pre-Challenger Bayesian Prediction of Failure
Conference · Fri Feb 01 00:00:00 EST 2008 · OSTI ID:1010386

Dynamic Data-Driven Event Reconstruction for Atmospheric Releases
Technical Report · Thu Mar 29 00:00:00 EDT 2007 · OSTI ID:1010386

Multimodal parameter spaces of a complex multi-channel neuron model
Journal Article · Thu Oct 20 00:00:00 EDT 2022 · Frontiers in Systems Neuroscience · OSTI ID:1010386