DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Simple, direct and efficient multi-way spectral clustering

Journal Article · Information and Inference (Online)
 [1];  [2];  [3]
  1. Department of Computer Science, Cornell University, Gates Hall, Ithaca, NY
  2. Center for Computational Biology, Flatiron Institute, Fifth Avenue, New York, NY
  3. Department of Mathematics and Institute for Computational & Mathematical Engineering, Stanford University, Serra Mall, Bldg, Stanford, CA

Abstract: We present a new algorithm for spectral clustering based on a column-pivoted QR factorization that may be directly used for cluster assignment or to provide an initial guess for k-means. Our algorithm is simple to implement, direct and requires no initial guess. Furthermore, it scales linearly in the number of nodes of the graph and a randomized variant provides significant computational gains. Provided the subspace spanned by the eigenvectors used for clustering contains a basis that resembles the set of indicator vectors on the clusters, we prove that both our deterministic and randomized algorithms recover a basis close to the indicators in Frobenius norm. We also experimentally demonstrate that the performance of our algorithm tracks recent information-theoretic bounds for exact recovery in the stochastic block model. Finally, we explore the performance of our algorithm when applied to a real-world graph.
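
The abstract does not spell out the steps of the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions. It takes an n-by-k matrix whose rows are the nodes' coordinates in the eigenvector basis, runs a greedy column-pivoted QR (here a hand-written modified Gram-Schmidt, an assumption of this sketch) on the transpose to pick k pivot nodes, and assigns each node to the orthonormal direction it aligns with most strongly. The function names (`pivoted_gram_schmidt`, `cpqr_cluster`, `demo`), the use of the Q factor for assignment, and the synthetic embedding (rotated, scaled cluster indicators plus small noise, standing in for real eigenvectors) are all hypothetical choices made for illustration and may differ from the authors' method.

```python
import math
import random


def pivoted_gram_schmidt(columns, k):
    """Greedy column-pivoted QR via modified Gram-Schmidt.

    `columns` is a list of vectors (the columns being factored).
    Returns the indices of the first k pivot columns and the k
    orthonormal directions they generate.
    """
    residual = [list(c) for c in columns]
    pivots, basis = [], []
    for _ in range(k):
        # Pivot on the column with the largest remaining norm.
        norms = [sum(x * x for x in r) for r in residual]
        p = max(range(len(residual)), key=norms.__getitem__)
        scale = math.sqrt(norms[p])
        q = [x / scale for x in residual[p]]
        pivots.append(p)
        basis.append(q)
        # Remove the q component from every column.
        for r in residual:
            dot = sum(a * b for a, b in zip(q, r))
            for i in range(len(r)):
                r[i] -= dot * q[i]
    return pivots, basis


def cpqr_cluster(embedding, k):
    """Assign each row of `embedding` (n rows, k columns) to a cluster.

    The rows of the embedding are the columns of its transpose, so they
    are passed directly to the pivoted factorization.
    """
    pivots, basis = pivoted_gram_schmidt(embedding, k)
    labels = []
    for row in embedding:
        # Score = magnitude of the row's coordinate along each direction.
        scores = [abs(sum(a * b for a, b in zip(q, row))) for q in basis]
        labels.append(max(range(k), key=scores.__getitem__))
    return labels, pivots


def demo():
    """Synthetic test: rotated, scaled cluster indicators plus small noise."""
    random.seed(0)
    # Rows form an orthogonal 3x3 matrix (an arbitrary rotation of the indicators).
    rotation = [[2 / 3, -1 / 3, 2 / 3],
                [2 / 3, 2 / 3, -1 / 3],
                [-1 / 3, 2 / 3, 2 / 3]]
    sizes = [4, 3, 5]
    truth, embedding = [], []
    for c, size in enumerate(sizes):
        for _ in range(size):
            truth.append(c)
            embedding.append([rotation[c][j] / math.sqrt(size)
                              + random.gauss(0, 0.01) for j in range(3)])
    labels, pivots = cpqr_cluster(embedding, 3)
    return truth, labels, pivots
```

Calling `demo()` returns the planted labels, the recovered labels, and the pivot indices. The recovered labels follow pivot order, so they match the planted ones only up to a permutation of cluster names. In practice one would compute the embedding from the graph's eigenvectors and use a library pivoted-QR routine rather than the explicit loop above.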

Sponsoring Organization:
USDOE
Grant/Contract Number:
FG02-97ER25308; SC0009409
OSTI ID:
1457488
Journal Information:
Information and Inference (Online), Vol. 8, Issue 1; ISSN 2049-8772
Publisher:
Oxford University Press
Country of Publication:
United Kingdom
Language:
English
