Simple, direct and efficient multi-way spectral clustering
Abstract
We present a new algorithm for spectral clustering based on a column-pivoted QR factorization that may be directly used for cluster assignment or to provide an initial guess for k-means. Our algorithm is simple to implement, direct and requires no initial guess. Furthermore, it scales linearly in the number of nodes of the graph and a randomized variant provides significant computational gains. Provided the subspace spanned by the eigenvectors used for clustering contains a basis that resembles the set of indicator vectors on the clusters, we prove that both our deterministic and randomized algorithms recover a basis close to the indicators in Frobenius norm. We also experimentally demonstrate that the performance of our algorithm tracks recent information theoretic bounds for exact recovery in the stochastic block model. Finally, we explore the performance of our algorithm when applied to a real-world graph.
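The abstract names the ingredients (eigenvectors, a column-pivoted QR factorization, direct cluster assignment) but this record does not spell out the steps. The following is a minimal sketch of one way such a procedure can look, not the authors' reference implementation. It assumes the eigenvectors are stored as the columns of an n-by-k matrix `V`, that the pivoted columns are orthogonalized with a polar factor, and that each node is assigned to the coordinate of largest magnitude. The function name `cpqr_cluster` and the toy graph are hypothetical and exist only for illustration.

```python
import numpy as np
from scipy.linalg import polar, qr


def cpqr_cluster(V):
    """Assign each row of V (n x k, orthonormal columns) to one of k clusters."""
    n, k = V.shape
    # Column-pivoted QR of V^T selects k well-conditioned columns,
    # i.e. k nodes that act as representatives of distinct clusters.
    _, _, piv = qr(V.T, mode="economic", pivoting=True)
    C = V.T[:, piv[:k]]  # k x k block of selected columns
    # The orthogonal polar factor of C rotates the eigenvector basis
    # towards one that resembles cluster indicator vectors.
    U, _ = polar(C)
    # Each node goes to the coordinate with the largest magnitude.
    return np.abs(V @ U).argmax(axis=1)


# Toy graph (assumed example data): three 10-node cliques joined by two bridge edges.
sizes = [10, 10, 10]
n = sum(sizes)
A = np.zeros((n, n))
start = 0
for s in sizes:
    A[start:start + s, start:start + s] = 1.0
    start += s
np.fill_diagonal(A, 0.0)
A[9, 10] = A[10, 9] = 1.0
A[19, 20] = A[20, 19] = 1.0

# Assumption: use the eigenvectors of the three largest adjacency eigenvalues.
_, vecs = np.linalg.eigh(A)
V = vecs[:, -3:]
labels = cpqr_cluster(V)
```

No iteration or initial guess is involved in this sketch: the labels come from one pivoted QR, one small polar decomposition and a row-wise maximum. The label values themselves are arbitrary; only the grouping of nodes is meaningful.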
- Authors:
- Damle, Anil; Minden, Victor; Ying, Lexing
- Department of Computer Science, Cornell University, Gates Hall, Ithaca, NY
- Center for Computational Biology, Flatiron Institute, Fifth Avenue, New York, NY
- Department of Mathematics and Institute for Computational & Mathematical Engineering, Stanford University, Serra Mall, Bldg, Stanford, CA
- Publication Date:
- 2018-06-27
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1457488
- Grant/Contract Number:
- FG02-97ER25308; FC02-13ER26134; SC0009409
- Resource Type:
- Published Article
- Journal Name:
- Information and Inference (Online)
- Additional Journal Information:
- Journal Name: Information and Inference (Online); Journal Volume: 8; Journal Issue: 1; Journal ID: ISSN 2049-8772
- Publisher:
- Oxford University Press
- Country of Publication:
- United Kingdom
- Language:
- English
Citation Formats
Damle, Anil, Minden, Victor, and Ying, Lexing. Simple, direct and efficient multi-way spectral clustering. United Kingdom: N. p., 2018. Web. doi:10.1093/imaiai/iay008.
Damle, Anil, Minden, Victor, & Ying, Lexing. Simple, direct and efficient multi-way spectral clustering. United Kingdom. https://doi.org/10.1093/imaiai/iay008
Damle, Anil, Minden, Victor, and Ying, Lexing. 2018. "Simple, direct and efficient multi-way spectral clustering". United Kingdom. https://doi.org/10.1093/imaiai/iay008.
@article{osti_1457488,
title = {Simple, direct and efficient multi-way spectral clustering},
author = {Damle, Anil and Minden, Victor and Ying, Lexing},
abstractNote = {We present a new algorithm for spectral clustering based on a column-pivoted QR factorization that may be directly used for cluster assignment or to provide an initial guess for k-means. Our algorithm is simple to implement, direct and requires no initial guess. Furthermore, it scales linearly in the number of nodes of the graph and a randomized variant provides significant computational gains. Provided the subspace spanned by the eigenvectors used for clustering contains a basis that resembles the set of indicator vectors on the clusters, we prove that both our deterministic and randomized algorithms recover a basis close to the indicators in Frobenius norm. We also experimentally demonstrate that the performance of our algorithm tracks recent information theoretic bounds for exact recovery in the stochastic block model. Finally, we explore the performance of our algorithm when applied to a real-world graph.},
doi = {10.1093/imaiai/iay008},
journal = {Information and Inference (Online)},
number = 1,
volume = 8,
place = {United Kingdom},
year = {2018},
month = jun
}
https://doi.org/10.1093/imaiai/iay008
Works referenced in this record:
The geometry of kernelized spectral clustering
journal, April 2015
- Schiebinger, Geoffrey; Wainwright, Martin J.; Yu, Bin
- The Annals of Statistics, Vol. 43, Issue 2
Least squares quantization in PCM
journal, March 1982
- Lloyd, S.
- IEEE Transactions on Information Theory, Vol. 28, Issue 2
Decay Properties of Spectral Projectors with Applications to Electronic Structure
journal, January 2013
- Benzi, Michele; Boito, Paola; Razouk, Nader
- SIAM Review, Vol. 55, Issue 1
Exact Recovery in the Stochastic Block Model
journal, January 2016
- Abbe, Emmanuel; Bandeira, Afonso S.; Hall, Georgina
- IEEE Transactions on Information Theory, Vol. 62, Issue 1
Spectral clustering and the high-dimensional stochastic blockmodel
journal, August 2011
- Rohe, Karl; Chatterjee, Sourav; Yu, Bin
- The Annals of Statistics, Vol. 39, Issue 4
CUR matrix decompositions for improved data analysis
journal, January 2009
- Mahoney, Michael W.; Drineas, Petros
- Proceedings of the National Academy of Sciences, Vol. 106, Issue 3
Efficient Algorithms for Computing a Strong Rank-Revealing QR Factorization
journal, July 1996
- Gu, Ming; Eisenstat, Stanley C.
- SIAM Journal on Scientific Computing, Vol. 17, Issue 4
Semidefinite programs on sparse random graphs and their application to community detection
conference, January 2016
- Montanari, Andrea; Sen, Subhabrata
- Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing - STOC 2016
Compressed Representation of Kohn–Sham Orbitals via Selected Columns of the Density Matrix
journal, March 2015
- Damle, Anil; Lin, Lin; Ying, Lexing
- Journal of Chemical Theory and Computation, Vol. 11, Issue 4
A comparative study of efficient initialization methods for the k-means clustering algorithm
journal, January 2013
- Celebi, M. Emre; Kingravi, Hassan A.; Vela, Patricio A.
- Expert Systems with Applications, Vol. 40, Issue 1
An Introduction to Matrix Concentration Inequalities
journal, January 2015
- Tropp, Joel A.
- Foundations and Trends® in Machine Learning, Vol. 8, Issue 1-2
Lower Bounds for the Partitioning of Graphs
journal, September 1973
- Donath, W. E.; Hoffman, A. J.
- IBM Journal of Research and Development, Vol. 17, Issue 5
Spectral redemption in clustering sparse networks
journal, November 2013
- Krzakala, F.; Moore, C.; Mossel, E.
- Proceedings of the National Academy of Sciences, Vol. 110, Issue 52
Computing Localized Representations of the Kohn–Sham Subspace Via Randomization and Refinement
journal, January 2017
- Damle, Anil; Lin, Lin; Ying, Lexing
- SIAM Journal on Scientific Computing, Vol. 39, Issue 6
Some metric inequalities in the space of matrices
journal, January 1955
- Fan, Ky; Hoffman, A. J.
- Proceedings of the American Mathematical Society, Vol. 6, Issue 1
Partitioning into Expanders
conference, January 2014
- Gharan, Shayan Oveis; Trevisan, Luca
- Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms
Computing the Polar Decomposition—with Applications
journal, October 1986
- Higham, Nicholas J.
- SIAM Journal on Scientific and Statistical Computing, Vol. 7, Issue 4
Stochastic blockmodels: First steps
journal, June 1983
- Holland, Paul W.; Laskey, Kathryn Blackmond; Leinhardt, Samuel
- Social Networks, Vol. 5, Issue 2
Sharp nonasymptotic bounds on the norm of random matrices with independent entries
journal, July 2016
- Bandeira, Afonso S.; van Handel, Ramon
- The Annals of Probability, Vol. 44, Issue 4
On Rank-Revealing Factorisations
journal, April 1994
- Chandrasekaran, Shivkumar; Ipsen, Ilse C. F.
- SIAM Journal on Matrix Analysis and Applications, Vol. 15, Issue 2
A BLAS-3 Version of the QR Factorization with Column Pivoting
journal, September 1998
- Quintana-Ortí, Gregorio; Sun, Xiaobai; Bischof, Christian H.
- SIAM Journal on Scientific Computing, Vol. 19, Issue 5
The Rotation of Eigenvectors by a Perturbation. III
journal, March 1970
- Davis, Chandler; Kahan, W. M.
- SIAM Journal on Numerical Analysis, Vol. 7, Issue 1
New Perturbation Bounds for the Unitary Polar Factor
journal, January 1995
- Li, Ren-Cang
- SIAM Journal on Matrix Analysis and Applications, Vol. 16, Issue 1
A tutorial on spectral clustering
journal, August 2007
- von Luxburg, Ulrike
- Statistics and Computing, Vol. 17, Issue 4
Community Detection in General Stochastic Block Models: Fundamental Limits and Efficient Algorithms for Recovery
conference, October 2015
- Abbe, Emmanuel; Sandon, Colin
- 2015 IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS)
Algebraic connectivity of graphs
journal, January 1973
- Fiedler, Miroslav
- Czechoslovak Mathematical Journal, Vol. 23, Issue 2
Achieving Exact Cluster Recovery Threshold via Semidefinite Programming
journal, May 2016
- Hajek, Bruce; Wu, Yihong; Xu, Jiaming
- IEEE Transactions on Information Theory, Vol. 62, Issue 5
Fast Monte-Carlo algorithms for finding low-rank approximations
journal, November 2004
- Frieze, Alan; Kannan, Ravi; Vempala, Santosh
- Journal of the ACM, Vol. 51, Issue 6
Linear least squares solutions by Householder transformations
journal, June 1965
- Businger, Peter; Golub, Gene H.
- Numerische Mathematik, Vol. 7, Issue 3
Achieving exact cluster recovery threshold via semidefinite programming
conference, June 2015
- Hajek, Bruce; Wu, Yihong; Xu, Jiaming
- 2015 IEEE International Symposium on Information Theory (ISIT)