Simple, direct and efficient multi-way spectral clustering
- Department of Computer Science, Cornell University, Gates Hall, Ithaca, NY
- Center for Computational Biology, Flatiron Institute, Fifth Avenue, New York, NY
- Department of Mathematics and Institute for Computational & Mathematical Engineering, Stanford University, Serra Mall, Bldg, Stanford, CA
Abstract: We present a new algorithm for spectral clustering based on a column-pivoted QR factorization that may be directly used for cluster assignment or to provide an initial guess for k-means. Our algorithm is simple to implement and direct, and it requires no initial guess. Furthermore, it scales linearly in the number of nodes of the graph, and a randomized variant provides significant computational gains. Provided the subspace spanned by the eigenvectors used for clustering contains a basis that resembles the set of indicator vectors on the clusters, we prove that both our deterministic and randomized algorithms recover a basis close to the indicators in Frobenius norm. We also experimentally demonstrate that the performance of our algorithm tracks recent information-theoretic bounds for exact recovery in the stochastic block model. Finally, we explore the performance of our algorithm when applied to a real-world graph.
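The abstract names the main ingredient (a column-pivoted QR factorization applied to the eigenvectors used for clustering) but this record does not spell out the steps. The following is a minimal Python sketch of one way such a scheme can work, not the authors' reference implementation. Everything in it is an assumption made for illustration: the function names `select_pivots` and `cpqr_cluster`, the use of a greedy pivot rule in place of a library QR routine, the orthogonalization of the pivot columns through an SVD-based polar factor, and the toy graph of three cliques joined by weak edges.

```python
import numpy as np


def select_pivots(A, k):
    """Pick k columns of A (shape k x n) with the greedy rule used by
    column-pivoted QR: take the column of largest remaining norm, then
    project that direction out of every column."""
    R = np.array(A, dtype=float)
    pivots = []
    for _ in range(k):
        norms = np.sum(R * R, axis=0)
        j = int(np.argmax(norms))
        pivots.append(j)
        q = R[:, j] / np.sqrt(norms[j])
        R -= np.outer(q, q @ R)  # remove the chosen direction
    return pivots


def cpqr_cluster(V):
    """Assign each of the n rows of V (n x k, orthonormal columns) to one
    of k clusters. Hypothetical helper, written for this sketch."""
    k = V.shape[1]
    pivots = select_pivots(V.T, k)
    # Orthogonal polar factor of the k x k block of pivot columns.
    U, _, Wt = np.linalg.svd(V.T[:, pivots])
    Q = U @ Wt
    # In the rotated basis each node should load mainly on one coordinate.
    return np.argmax(np.abs(Q.T @ V.T), axis=0)


# Toy graph (an assumption for this demo): three cliques of four nodes,
# with weak edges of weight 0.01 between nodes in different cliques.
n_blocks, size = 3, 4
n = n_blocks * size
A = np.full((n, n), 0.01)
for b in range(n_blocks):
    s = slice(b * size, (b + 1) * size)
    A[s, s] = 1.0
np.fill_diagonal(A, 0.0)

# Eigenvectors for the k largest eigenvalues of the adjacency matrix
# (eigh returns eigenvalues in ascending order).
_, vecs = np.linalg.eigh(A)
V = vecs[:, -n_blocks:]

labels = cpqr_cluster(V)
print(labels)
```

On this toy graph the leading eigenvectors span the cluster indicator vectors exactly, so nodes in the same clique receive the same label; the numbering of the labels is arbitrary. As the abstract notes, such an assignment can also serve as an initial guess for k-means rather than as the final clustering.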
- Sponsoring Organization:
- USDOE
- Grant/Contract Number:
- FG02-97ER25308; SC0009409
- OSTI ID:
- 1457488
- Journal Information:
- Information and Inference (Online), Vol. 8, Issue 1; ISSN 2049-8772
- Publisher:
- Oxford University Press
- Country of Publication:
- United Kingdom
- Language:
- English
Similar Records
Simulated Half-Precision Implementation of Blocked QR Factorization and Graph Clustering Applications
Inverse Subspace Iteration for Spectral Stochastic Finite Element Methods