A Fast Algorithm for Maximum Likelihood Estimation of Mixture Proportions using Sequential Quadratic Programming
Abstract
Maximum likelihood estimation of mixture proportions has a long history, and continues to play an important role in modern statistics, including in the development of nonparametric empirical Bayes methods. Maximum likelihood estimation of mixture proportions has traditionally been carried out using the expectation maximization (EM) algorithm, but recent work by Koenker and Mizera shows that modern convex optimization techniques, in particular interior point methods, are substantially faster and more accurate than EM. Here, we develop a new solution based on sequential quadratic programming (SQP). It is substantially faster than the interior point method, and just as accurate. Our approach combines several ideas: first, it solves a reformulation of the original problem; second, it uses an SQP approach to make the best use of the expensive gradient and Hessian computations; third, the SQP iterations are implemented using an active set method to exploit the sparse nature of the quadratic subproblems; fourth, it uses accurate low-rank approximations for more efficient gradient and Hessian computations. We illustrate the benefits of the SQP approach in experiments on synthetic datasets and a large genetic association dataset. In large datasets (n ≈ 10^6 observations, m ≈ 10^3 mixture components), our implementation achieves at least a 100-fold reduction in runtime compared with a state-of-the-art interior point solver. Our methods are implemented in Julia and in an R package available on CRAN. Supplementary materials for this article are available online.
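For context, the optimization problem the abstract refers to is maximization of sum_i log((Lx)_i) over the probability simplex, where L is the n × m matrix of component likelihoods L[i, k] = p(data_i | component k) and x holds the mixture proportions. The following is a minimal sketch in Python/NumPy of the classical EM baseline that the paper's mix-SQP method is compared against (this is the textbook EM update, not the paper's algorithm; the function name is illustrative):

```python
import numpy as np

def em_mix_props(L, num_iters=200):
    """EM updates for maximizing sum_i log((L x)_i) over the simplex.

    L is an n x m matrix of nonnegative likelihoods. Each EM step
    multiplies x elementwise by the average responsibility weight,
    which keeps x nonnegative and summing to one.
    """
    n, m = L.shape
    x = np.full(m, 1.0 / m)            # start at uniform proportions
    for _ in range(num_iters):
        p = L @ x                      # mixture densities, length n
        x = x * (L.T @ (1.0 / p)) / n  # multiplicative EM update
    return x
```

Each update is cheap (two matrix-vector products), but EM's convergence is slow; the paper's point is that second-order methods like interior point and SQP reach an accurate solution in far fewer, better-used iterations.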
- Authors:
- Kim, Youngseok; Carbonetto, Peter; Stephens, Matthew; Anitescu, Mihai
- Univ. of Chicago, IL (United States)
- Univ. of Chicago, IL (United States); Argonne National Lab. (ANL), Lemont, IL (United States)
- Publication Date:
- January 8, 2020
- Research Org.:
- Argonne National Laboratory (ANL), Argonne, IL (United States)
- Sponsoring Org.:
- National Science Foundation (NSF); National Institutes of Health (NIH); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
- OSTI Identifier:
- 1660711
- Grant/Contract Number:
- AC02-06CH11357
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Journal of Computational and Graphical Statistics
- Additional Journal Information:
- Journal Volume: 29; Journal Issue: 2; Journal ID: ISSN 1061-8600
- Publisher:
- Taylor & Francis
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Kim, Youngseok, Carbonetto, Peter, Stephens, Matthew, and Anitescu, Mihai. A Fast Algorithm for Maximum Likelihood Estimation of Mixture Proportions using Sequential Quadratic Programming. United States: N. p., 2020.
Web. doi:10.1080/10618600.2019.1689985.
Kim, Youngseok, Carbonetto, Peter, Stephens, Matthew, & Anitescu, Mihai. A Fast Algorithm for Maximum Likelihood Estimation of Mixture Proportions using Sequential Quadratic Programming. United States. https://doi.org/10.1080/10618600.2019.1689985
Kim, Youngseok, Carbonetto, Peter, Stephens, Matthew, and Anitescu, Mihai. 2020. "A Fast Algorithm for Maximum Likelihood Estimation of Mixture Proportions using Sequential Quadratic Programming". United States. https://doi.org/10.1080/10618600.2019.1689985. https://www.osti.gov/servlets/purl/1660711.
@article{osti_1660711,
title = {A Fast Algorithm for Maximum Likelihood Estimation of Mixture Proportions using Sequential Quadratic Programming},
author = {Kim, Youngseok and Carbonetto, Peter and Stephens, Matthew and Anitescu, Mihai},
abstractNote = {Maximum likelihood estimation of mixture proportions has a long history, and continues to play an important role in modern statistics, including in the development of nonparametric empirical Bayes methods. Maximum likelihood estimation of mixture proportions has traditionally been carried out using the expectation maximization (EM) algorithm, but recent work by Koenker and Mizera shows that modern convex optimization techniques, in particular interior point methods, are substantially faster and more accurate than EM. Here, we develop a new solution based on sequential quadratic programming (SQP). It is substantially faster than the interior point method, and just as accurate. Our approach combines several ideas: first, it solves a reformulation of the original problem; second, it uses an SQP approach to make the best use of the expensive gradient and Hessian computations; third, the SQP iterations are implemented using an active set method to exploit the sparse nature of the quadratic subproblems; fourth, it uses accurate low-rank approximations for more efficient gradient and Hessian computations. We illustrate the benefits of the SQP approach in experiments on synthetic datasets and a large genetic association dataset. In large datasets (n ≈ 10^6 observations, m ≈ 10^3 mixture components), our implementation achieves at least a 100-fold reduction in runtime compared with a state-of-the-art interior point solver. Our methods are implemented in Julia and in an R package available on CRAN. Supplementary materials for this article are available online.},
doi = {10.1080/10618600.2019.1689985},
journal = {Journal of Computational and Graphical Statistics},
number = 2,
volume = 29,
place = {United States},
year = {2020},
month = {jan}
}
Works referenced in this record:
Mixture Densities, Maximum Likelihood and the EM Algorithm
journal, April 1984
- Redner, Richard A.; Walker, Homer F.
- SIAM Review, Vol. 26, Issue 2
A Stochastic Quasi-Newton Method for Large-Scale Optimization
journal, January 2016
- Byrd, R. H.; Hansen, S. L.; Nocedal, Jorge
- SIAM Journal on Optimization, Vol. 26, Issue 2
Contributions to the Mathematical Theory of Evolution
journal, January 1894
- Pearson, K.
- Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, Vol. 185, Issue 0
Convex Optimization, Shape Constraints, Compound Decisions, and Empirical Bayes Rules
journal, April 2014
- Koenker, Roger; Mizera, Ivan
- Journal of the American Statistical Association, Vol. 109, Issue 506
Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
journal, January 2011
- Halko, N.; Martinsson, P. G.; Tropp, J. A.
- SIAM Review, Vol. 53, Issue 2
The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog)
journal, November 2016
- MacArthur, Jacqueline; Bowler, Emily; Cerezo, Maria
- Nucleic Acids Research, Vol. 45, Issue D1
Deconvolution of a Distribution Function
journal, December 1997
- Cordy, Clifford B.; Thomas, David R.
- Journal of the American Statistical Association, Vol. 92, Issue 440
REBayes: An R Package for Empirical Bayes Mixture Methods
journal, January 2017
- Koenker, Roger; Gu, Jiaying
- Journal of Statistical Software, Vol. 82, Issue 8
Second-Order Stochastic Optimization for Machine Learning in Linear Time
text, January 2016
- Agarwal, Naman; Bullins, Brian; Hazan, Elad
- arXiv
Interior-point methods
journal, December 2000
- Potra, Florian A.; Wright, Stephen J.
- Journal of Computational and Applied Mathematics, Vol. 124, Issue 1-2
Julia: A Fast Dynamic Language for Technical Computing
preprint, January 2012
- Bezanson, Jeff; Karpinski, Stefan; Shah, Viral B.
- arXiv
Nonmonotone Spectral Projected Gradient Methods on Convex Sets
journal, January 2000
- Birgin, Ernesto G.; Martínez, José Mario; Raydan, Marcos
- SIAM Journal on Optimization, Vol. 10, Issue 4
Consistency of the Maximum Likelihood Estimator in the Presence of Infinitely Many Incidental Parameters
journal, December 1956
- Kiefer, J.; Wolfowitz, J.
- The Annals of Mathematical Statistics, Vol. 27, Issue 4
K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation
journal, November 2006
- Aharon, M.; Elad, M.; Bruckstein, A.
- IEEE Transactions on Signal Processing, Vol. 54, Issue 11
The performance of standard and hybrid EM algorithms for ML estimates of the normal mixture model with censoring
journal, December 1992
- Atkinson, Scott E.
- Journal of Statistical Computation and Simulation, Vol. 44, Issue 1-2
On Convergence Properties of the EM Algorithm for Gaussian Mixtures
journal, January 1996
- Xu, Lei; Jordan, Michael I.
- Neural Computation, Vol. 8, Issue 1
Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm
journal, June 2008
- Varadhan, Ravi; Roland, Christophe
- Scandinavian Journal of Statistics, Vol. 35, Issue 2
The Mosek Interior Point Optimizer for Linear Programming: An Implementation of the Homogeneous Algorithm
book, January 2000
- Andersen, Erling D.; Andersen, Knud D.
- Applied Optimization
Maximum Likelihood from Incomplete Data Via the EM Algorithm
journal, September 1977
- Dempster, A. P.; Laird, N. M.; Rubin, D. B.
- Journal of the Royal Statistical Society: Series B (Methodological), Vol. 39, Issue 1
Defining the role of common variation in the genomic and biological architecture of adult human height
journal, October 2014
- Wood, Andrew R.; Esko, Tonu; Yang, Jian
- Nature Genetics, Vol. 46, Issue 11
Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences
journal, August 2004
- Johnstone, Iain M.; Silverman, Bernard W.
- The Annals of Statistics, Vol. 32, Issue 4
Nonparametric Maximum Likelihood Estimation of a Mixing Distribution
journal, December 1978
- Laird, Nan
- Journal of the American Statistical Association, Vol. 73, Issue 364
Tackling Box-Constrained Optimization via a New Projected Quasi-Newton Approach
journal, January 2010
- Kim, Dongmin; Sra, Suvrit; Dhillon, Inderjit S.
- SIAM Journal on Scientific Computing, Vol. 32, Issue 6
JuMP: A Modeling Language for Mathematical Optimization
journal, January 2017
- Dunning, Iain; Huchette, Joey; Lubin, Miles
- SIAM Review, Vol. 59, Issue 2
Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions
text, January 2009
- Halko, Nathan; Martinsson, Per-Gunnar; Tropp, Joel A.
- arXiv
Nonparametric empirical Bayes and compound decision approaches to estimation of a high-dimensional vector of normal means
journal, August 2009
- Brown, Lawrence D.; Greenshtein, Eitan
- The Annals of Statistics, Vol. 37, Issue 4
General maximum likelihood empirical Bayes estimation of normal means
journal, August 2009
- Jiang, Wenhua; Zhang, Cun-Hui
- The Annals of Statistics, Vol. 37, Issue 4
Needles and straw in haystacks: Empirical Bayes estimates of possibly sparse sequences
text, January 2004
- Johnstone, Iain M.; Silverman, Bernard W.
- arXiv
Nonparametric empirical Bayes and compound decision approaches to estimation of a high-dimensional vector of normal means
text, January 2009
- Brown, Lawrence D.; Greenshtein, Eitan
- arXiv
A Stochastic Approximation Method
journal, September 1951
- Robbins, Herbert; Monro, Sutton
- The Annals of Mathematical Statistics, Vol. 22, Issue 3
Efficient projections onto the ℓ1-ball for learning in high dimensions
conference, January 2008
- Duchi, John; Shalev-Shwartz, Shai; Singer, Yoram
- Proceedings of the 25th international conference on Machine learning - ICML '08
Works referencing / citing this record:
Solving the Empirical Bayes Normal Means Problem with Correlated Noise
preprint, January 2018
- Sun, Lei; Stephens, Matthew
- arXiv