 
The Annals of Applied Probability
2005, Vol. 15, No. 1A, 69–92
DOI 10.1214/105051604000000512
© Institute of Mathematical Statistics, 2005
LEARNING MIXTURES OF SEPARATED
NONSPHERICAL GAUSSIANS
BY SANJEEV ARORA AND RAVI KANNAN
Princeton University and Yale University
Mixtures of Gaussian (or normal) distributions arise in a variety of
application areas. Many heuristics have been proposed for the task of finding
the component Gaussians given samples from the mixture, such as the EM
algorithm, a local-search heuristic from Dempster, Laird and Rubin [J. Roy.
Statist. Soc. Ser. B 39 (1977) 1–38]. These do not provably run in polynomial
time.
We present the first algorithm that provably learns the component
Gaussians in time that is polynomial in the dimension. The Gaussians may
have arbitrary shape, but they must satisfy a "separation condition" which
places a lower bound on the distance between the centers of any two
component Gaussians. The mathematical results at the heart of our proof are
"distance concentration" results proved using isoperimetric inequalities.
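The separation condition can be made concrete with a small sketch. The check below is an illustrative assumption, not the paper's exact bound: it requires every pair of component centers to be at distance at least a constant multiple of the sum of the components' largest marginal standard deviations (the paper's actual condition involves directional variances and polylogarithmic factors in the dimension).

```python
import math

def max_std(cov):
    # Largest marginal standard deviation, used here as a crude proxy
    # for the Gaussian's "radius" (an assumption for illustration only).
    return max(math.sqrt(cov[i][i]) for i in range(len(cov)))

def well_separated(centers, covs, c=4.0):
    """Check a placeholder separation condition: every pair of centers
    must be at distance >= c * (sigma_i + sigma_j). The constant c is
    a hypothetical choice, not the constant from the paper."""
    k = len(centers)
    for i in range(k):
        for j in range(i + 1, k):
            dist = math.sqrt(sum((a - b) ** 2
                                 for a, b in zip(centers[i], centers[j])))
            if dist < c * (max_std(covs[i]) + max_std(covs[j])):
                return False
    return True

# Two unit-covariance Gaussians in the plane, centers 10 apart:
centers = [(0.0, 0.0), (10.0, 0.0)]
covs = [[[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [0.0, 1.0]]]
print(well_separated(centers, covs))  # 10 >= 4 * (1 + 1), so True
```

The point of such a condition is that when centers are far apart relative to the components' spreads, samples concentrate near their own component and the mixture can be disentangled.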
