U.S. Department of Energy
Office of Scientific and Technical Information

Supervised Gamma Process Poisson Factorization

Thesis/Dissertation · OSTI ID: 1182679
Univ. of Texas, Austin, TX (United States)

This thesis develops the supervised gamma process Poisson factorization (S-GPPF) framework, a novel supervised topic model for joint modeling of count matrices and document labels. S-GPPF is fully generative and nonparametric: document labels and count matrices are modeled under a unified probabilistic framework and the number of latent topics is controlled automatically via a gamma process prior. The framework provides for multi-class classification of documents using a generative max-margin classifier. Several recent data augmentation techniques are leveraged to provide for exact inference using a Gibbs sampling scheme. The first portion of this thesis reviews supervised topic modeling and several key mathematical devices used in the formulation of S-GPPF. The thesis then introduces the S-GPPF generative model and derives the conditional posterior distributions of the latent variables for posterior inference via Gibbs sampling. The S-GPPF is shown to exhibit state-of-the-art performance for joint topic modeling and document classification on a dataset of conference abstracts, beating out competing supervised topic models. The unique properties of S-GPPF along with its competitive performance make it a novel contribution to supervised topic modeling.
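As a rough illustration of the unsupervised part of such a model, the sketch below draws a document-by-word count matrix from a finite truncation of a gamma process Poisson factorization. It is not taken from the thesis: the truncation level, the hyperparameter names (`gamma0`, `c`, `eta`), the unit scale on the document loadings, and the function names are all assumptions chosen for illustration, and the label model, max-margin classifier, and Gibbs sampler described in the abstract are not covered.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) with Knuth's method (adequate for small rates)."""
    if rate <= 0.0:
        return 0
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_gamma(shape, scale, rng):
    """Gamma draw that returns 0.0 when the shape is not positive."""
    return rng.gammavariate(shape, scale) if shape > 0.0 else 0.0


def sample_truncated_gppf(num_docs, vocab_size, truncation,
                          gamma0=2.0, c=1.0, eta=0.5, seed=0):
    """Hypothetical generative sketch (assumed parameterization):

    r_k       ~ Gamma(gamma0 / truncation, scale=1/c)  # topic weights
    phi_k     ~ Dirichlet(eta, ..., eta)               # topic-word distributions
    theta_dk  ~ Gamma(r_k, scale=1)                    # document-topic loadings
    x_dv      ~ Poisson(sum_k theta_dk * phi_kv)       # observed counts
    """
    rng = random.Random(seed)

    # Finite approximation to the gamma process: most r_k come out near zero,
    # so only a few topics carry appreciable mass.
    r = [sample_gamma(gamma0 / truncation, 1.0 / c, rng)
         for _ in range(truncation)]

    # Dirichlet draws via normalized gamma variates.
    phi = []
    for _ in range(truncation):
        raw = [rng.gammavariate(eta, 1.0) for _ in range(vocab_size)]
        total = sum(raw)
        if total > 0.0:
            phi.append([value / total for value in raw])
        else:
            phi.append([1.0 / vocab_size] * vocab_size)

    theta = [[sample_gamma(r[k], 1.0, rng) for k in range(truncation)]
             for _ in range(num_docs)]

    counts = []
    for d in range(num_docs):
        row = []
        for v in range(vocab_size):
            rate = sum(theta[d][k] * phi[k][v] for k in range(truncation))
            row.append(sample_poisson(rate, rng))
        counts.append(row)
    return r, phi, theta, counts


if __name__ == "__main__":
    r, phi, theta, counts = sample_truncated_gppf(
        num_docs=4, vocab_size=6, truncation=5)
    for row in counts:
        print(row)
```

Because the shape parameter `gamma0 / truncation` shrinks as the truncation level grows, increasing the number of candidate topics in this sketch mostly adds topics with negligible weight, which is one common way to picture how a gamma process prior can limit the effective number of topics.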

Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
AC04-94AL85000
OSTI ID:
1182679
Report Number(s):
SAND2015--3996T; 583918
Country of Publication:
United States
Language:
English
