DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Simple, direct and efficient multi-way spectral clustering

Abstract

We present a new algorithm for spectral clustering based on a column-pivoted QR factorization that may be directly used for cluster assignment or to provide an initial guess for k-means. Our algorithm is simple to implement, direct and requires no initial guess. Furthermore, it scales linearly in the number of nodes of the graph and a randomized variant provides significant computational gains. Provided the subspace spanned by the eigenvectors used for clustering contains a basis that resembles the set of indicator vectors on the clusters, we prove that both our deterministic and randomized algorithms recover a basis close to the indicators in Frobenius norm. We also experimentally demonstrate that the performance of our algorithm tracks recent information theoretic bounds for exact recovery in the stochastic block model. Finally, we explore the performance of our algorithm when applied to a real-world graph.
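
The idea in the abstract, using a column-pivoted QR factorization of the eigenvector matrix to assign clusters directly, can be illustrated with a minimal Python sketch. This is not taken from the article itself. The function name `cpqr_cluster`, the input `V` (an n-by-k matrix with orthonormal columns, such as the k eigenvectors used for clustering), the rotation by the orthogonal polar factor of the selected block, and the largest-magnitude assignment rule are all assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import qr


def cpqr_cluster(V):
    """Assign each of n nodes to one of k clusters (illustrative sketch).

    V is assumed to be an (n, k) array with orthonormal columns whose span
    is close to the span of the cluster indicator vectors.
    """
    n, k = V.shape

    # Column-pivoted QR of V^T (a k x n matrix). The first k pivots pick
    # k nodes whose rows of V are well conditioned as a basis.
    _, _, piv = qr(V.T, mode="economic", pivoting=True)
    selected = piv[:k]

    # Orthogonal polar factor of the selected k x k block, via its SVD
    # (assumed step: it rotates V toward an indicator-like basis).
    U, _, Wt = np.linalg.svd(V[selected, :].T)
    polar = U @ Wt

    # After the rotation, each row should have one dominant entry;
    # its position is used as the cluster label.
    return np.argmax(np.abs(V @ polar), axis=1)


if __name__ == "__main__":
    # Toy input: normalized indicators of three clusters, hidden behind
    # an arbitrary orthogonal rotation.
    rng = np.random.default_rng(0)
    truth = np.repeat([0, 1, 2], [5, 7, 9])
    Z = np.eye(3)[truth]
    Z = Z / np.linalg.norm(Z, axis=0)
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    print(cpqr_cluster(Z @ Q))
```

The labels returned are only defined up to a permutation of the cluster indices. As the abstract notes, such an assignment could also serve as an initial guess for k-means rather than as the final answer.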

Authors:
 Damle, Anil [1]; Minden, Victor [2]; Ying, Lexing [3]
  1. Department of Computer Science, Cornell University, Gates Hall, Ithaca, NY
  2. Center for Computational Biology, Flatiron Institute, Fifth Avenue, New York, NY
  3. Department of Mathematics and Institute for Computational & Mathematical Engineering, Stanford University, Serra Mall, Bldg, Stanford, CA
Publication Date:
2018-06-27
Sponsoring Org.:
USDOE
OSTI Identifier:
1457488
Grant/Contract Number:  
FG02-97ER25308; FC02-13ER26134; SC0009409
Resource Type:
Published Article
Journal Name:
Information and Inference (Online)
Additional Journal Information:
Journal Name: Information and Inference (Online); Journal Volume: 8; Journal Issue: 1; Journal ID: ISSN 2049-8772
Publisher:
Oxford University Press
Country of Publication:
United Kingdom
Language:
English

Citation Formats

Damle, Anil, Minden, Victor, and Ying, Lexing. Simple, direct and efficient multi-way spectral clustering. United Kingdom: N. p., 2018. Web. doi:10.1093/imaiai/iay008.
Damle, Anil, Minden, Victor, & Ying, Lexing. Simple, direct and efficient multi-way spectral clustering. United Kingdom. https://doi.org/10.1093/imaiai/iay008
Damle, Anil, Minden, Victor, and Ying, Lexing. 2018. "Simple, direct and efficient multi-way spectral clustering". United Kingdom. https://doi.org/10.1093/imaiai/iay008.
@article{osti_1457488,
title = {Simple, direct and efficient multi-way spectral clustering},
author = {Damle, Anil and Minden, Victor and Ying, Lexing},
abstractNote = {We present a new algorithm for spectral clustering based on a column-pivoted QR factorization that may be directly used for cluster assignment or to provide an initial guess for k-means. Our algorithm is simple to implement, direct and requires no initial guess. Furthermore, it scales linearly in the number of nodes of the graph and a randomized variant provides significant computational gains. Provided the subspace spanned by the eigenvectors used for clustering contains a basis that resembles the set of indicator vectors on the clusters, we prove that both our deterministic and randomized algorithms recover a basis close to the indicators in Frobenius norm. We also experimentally demonstrate that the performance of our algorithm tracks recent information theoretic bounds for exact recovery in the stochastic block model. Finally, we explore the performance of our algorithm when applied to a real-world graph.},
doi = {10.1093/imaiai/iay008},
journal = {Information and Inference (Online)},
number = 1,
volume = 8,
place = {United Kingdom},
year = {2018},
month = jun
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
https://doi.org/10.1093/imaiai/iay008
