OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: A Fast Algorithm for Maximum Likelihood Estimation of Mixture Proportions using Sequential Quadratic Programming

Journal Article · Journal of Computational and Graphical Statistics
 [1];  [1];  [1];  [2]
  1. Univ. of Chicago, IL (United States)
  2. Univ. of Chicago, IL (United States); Argonne National Lab. (ANL), Lemont, IL (United States)

Maximum likelihood estimation of mixture proportions has a long history, and continues to play an important role in modern statistics, including in the development of nonparametric empirical Bayes methods. Maximum likelihood estimation of mixture proportions has traditionally been carried out using the expectation maximization (EM) algorithm, but recent work by Koenker and Mizera shows that modern convex optimization techniques (in particular, interior point methods) are substantially faster and more accurate than EM. Here, we develop a new solution based on sequential quadratic programming (SQP). It is substantially faster than the interior point method, and just as accurate. Our approach combines several ideas: first, it solves a reformulation of the original problem; second, it uses an SQP approach to make the best use of the expensive gradient and Hessian computations; third, the SQP iterations are implemented using an active set method to exploit the sparse nature of the quadratic subproblems; fourth, it uses accurate low-rank approximations for more efficient gradient and Hessian computations. We illustrate the benefits of the SQP approach in experiments on synthetic datasets and a large genetic association dataset. In large datasets (n ≈ 10^6 observations, m ≈ 10^3 mixture components), our implementation achieves at least a 100-fold reduction in runtime compared with a state-of-the-art interior point solver. Our methods are implemented in Julia and in an R package available on CRAN. Supplementary materials for this article are available online.
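
For orientation, the traditional EM baseline mentioned in the abstract can be sketched in a few lines. This is not the authors' SQP method or their Julia/R code; it is a minimal illustration under stated assumptions. The n-by-m likelihood matrix `L` (entry `L[j][k]` is the likelihood of observation j under mixture component k), the function names, and the toy values are all hypothetical and chosen only for illustration.

```python
import math

def log_likelihood(L, x):
    """Average log-likelihood (1/n) * sum_j log(sum_k L[j][k] * x[k])."""
    return sum(math.log(sum(l * w for l, w in zip(row, x))) for row in L) / len(L)

def em_mixture_proportions(L, iters=200):
    """Plain EM for mixture proportions with fixed, known component likelihoods."""
    n, m = len(L), len(L[0])
    x = [1.0 / m] * m  # start from uniform proportions
    for _ in range(iters):
        new_x = [0.0] * m
        for row in L:
            # Mixture likelihood of this observation under the current proportions
            denom = sum(l * w for l, w in zip(row, x))
            # E-step: accumulate the posterior responsibility of each component
            for k in range(m):
                new_x[k] += row[k] * x[k] / denom
        # M-step: the new proportions are the average responsibilities
        x = [v / n for v in new_x]
    return x

# Toy likelihood matrix with n = 4 observations and m = 2 components
# (hypothetical values, not taken from the article).
L = [
    [0.9, 0.1],
    [0.8, 0.3],
    [0.2, 0.7],
    [0.1, 0.9],
]
x_hat = em_mixture_proportions(L)
print(x_hat, log_likelihood(L, x_hat))
```

Each EM pass touches every entry of `L`, so its cost grows with n times m, and many passes may be needed before the proportions settle. That cost is the motivation the abstract gives for moving to second-order convex optimization methods.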

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
National Science Foundation (NSF); National Institutes of Health (NIH); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Grant/Contract Number:
AC02-06CH11357
OSTI ID:
1660711
Journal Information:
Journal of Computational and Graphical Statistics, Vol. 29, Issue 2; ISSN 1061-8600
Publisher:
Taylor & Francis
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 6 works
Citation information provided by
Web of Science

References (34)

Mixture Densities, Maximum Likelihood and the EM Algorithm journal April 1984
A Stochastic Quasi-Newton Method for Large-Scale Optimization journal January 2016
Contributions to the Mathematical Theory of Evolution journal January 1894
Convex Optimization, Shape Constraints, Compound Decisions, and Empirical Bayes Rules journal April 2014
Convex Optimization book January 2004
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions journal January 2011
The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog) journal November 2016
Deconvolution of a Distribution Function journal December 1997
REBayes : An R Package for Empirical Bayes Mixture Methods journal January 2017
Second-Order Stochastic Optimization for Machine Learning in Linear Time text January 2016
Interior-point methods journal December 2000
Julia: A Fast Dynamic Language for Technical Computing preprint January 2012
Nonmonotone Spectral Projected Gradient Methods on Convex Sets journal January 2000
Consistency of the Maximum Likelihood Estimator in the Presence of Infinitely Many Incidental Parameters journal December 1956
K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation journal November 2006
A global reference for human genetic variation journal January 2015
The performance of standard and hybrid EM algorithms for ML estimates of the normal mixture model with censoring journal December 1992
On Convergence Properties of the EM Algorithm for Gaussian Mixtures journal January 1996
Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm journal June 2008
The Mosek Interior Point Optimizer for Linear Programming: An Implementation of the Homogeneous Algorithm book January 2000
Maximum Likelihood from Incomplete Data Via the EM Algorithm journal September 1977
False discovery rates: a new deal journal October 2016
Defining the role of common variation in the genomic and biological architecture of adult human height journal October 2014
Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences journal August 2004
Nonparametric Maximum Likelihood Estimation of a Mixing Distribution journal December 1978
Tackling Box-Constrained Optimization via a New Projected Quasi-Newton Approach journal January 2010
JuMP: A Modeling Language for Mathematical Optimization journal January 2017
Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions text January 2009
Nonparametric empirical Bayes and compound decision approaches to estimation of a high-dimensional vector of normal means journal August 2009
General maximum likelihood empirical Bayes estimation of normal means journal August 2009
Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences text January 2004
Nonparametric empirical Bayes and compound decision approaches to estimation of a high-dimensional vector of normal means text January 2009
A Stochastic Approximation Method journal September 1951
Efficient projections onto the ℓ1-ball for learning in high dimensions conference January 2008

Similar Records

Large-scale sequential quadratic programming algorithms
Technical Report · September 1, 1992 · OSTI ID:1660711

Sequential quadratic programming algorithms for optimization
Technical Report · August 1, 1989 · OSTI ID:1660711