DOE PAGES

Title: Stochastic Spectral Descent for Discrete Graphical Models

Interest in deep probabilistic graphical models has increased in recent years, due to their state-of-the-art performance on many machine learning applications. Such models are typically trained with the stochastic gradient method, which can take a significant number of iterations to converge. Since the computational cost of gradient estimation is prohibitive even for modestly sized models, training becomes slow and practically usable models are kept small. In this paper we propose a new, largely tuning-free algorithm to address this problem. Our approach derives novel majorization bounds based on the Schatten-∞ norm. Intriguingly, the minimizers of these bounds can be interpreted as gradient methods in a non-Euclidean space. We thus propose using a stochastic gradient method in non-Euclidean space. We both provide simple conditions under which our algorithm is guaranteed to converge, and demonstrate empirically that our algorithm leads to dramatically faster training and improved predictive ability compared to stochastic gradient descent for both directed and undirected graphical models.
Authors:
 [1] ;  [2] ;  [2] ;  [3] ;  [2]
  1. Columbia Univ., New York, NY (United States)
  2. Ecole Polytechnique Federale Lausanne (Switzerland)
  3. Duke Univ., Durham, NC (United States)
Publication Date:
Grant/Contract Number:
NA0002534
Type:
Accepted Manuscript
Journal Name:
IEEE Journal of Selected Topics in Signal Processing
Additional Journal Information:
Journal Volume: 10; Journal Issue: 2; Journal ID: ISSN 1932-4553
Publisher:
IEEE
Research Org:
Univ. of Michigan, Ann Arbor, MI (United States)
Sponsoring Org:
USDOE National Nuclear Security Administration (NNSA), Office of Nonproliferation and Verification Research and Development (NA-22)
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Gradient methods; graphical models; maximum likelihood estimation; Monte Carlo simulation methods; Boltzmann distributions.
OSTI Identifier:
1367144