OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Density-Aware Clustering Based on Aggregated Heat Kernel and Its Transformation

Journal Article · ACM Transactions on Knowledge Discovery from Data
DOI: https://doi.org/10.1145/2700385 · OSTI ID: 1210172
 [1];  [2];  [2];  [1]
  1. Stony Brook Univ., NY (United States). Computer Sciences Dept.
  2. Brookhaven National Lab. (BNL), Upton, NY (United States). Computational Science Center

Current spectral clustering algorithms are sensitive to noise and to the choice of scaling parameter, and may not account for differing density distributions across clusters. If these problems are left untreated, the resulting clusters cannot accurately represent the true data patterns, particularly for complex real-world datasets with heterogeneous densities. This paper aims to solve these problems by proposing a diffusion-based Aggregated Heat Kernel (AHK) to improve clustering stability, and a Local Density Affinity Transformation (LDAT) to correct the bias originating from different cluster densities. AHK statistically models the heat diffusion traces along the entire time scale, so it ensures robustness during the clustering process, while LDAT probabilistically reveals the local density of each instance and suppresses the local density bias in the affinity matrix. Our proposed framework integrates these two techniques systematically. As a result, it not only provides an advanced noise-resisting and density-aware spectral mapping of the original dataset, but also remains stable while the scaling parameter (which usually controls the range of the neighborhood) is tuned. Furthermore, our framework works well with the majority of similarity kernels, which ensures its applicability to many types of data and problem domains. Systematic experiments on different applications show that our proposed algorithms outperform state-of-the-art clustering algorithms on data with heterogeneous density distributions, and achieve robust clustering performance with respect to tuning the scaling parameter and handling various levels and types of noise.
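
To make the idea of aggregating heat diffusion "along the entire time scale" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian affinity, the symmetric normalized graph Laplacian L, and an exponentially weighted time integral of the heat kernel exp(-tL); under that assumption the integral has the closed form (sI + L)^-1. The function names, the weight parameter `s`, and the choice of Laplacian are illustrative assumptions; the paper's exact AHK definition may differ, and LDAT is not sketched here.

```python
import numpy as np


def gaussian_affinity(X, sigma):
    """Dense Gaussian (RBF) affinity matrix with a zero diagonal."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def normalized_laplacian(W):
    """Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]


def heat_kernel(L, t):
    """Heat kernel at a single time t: exp(-t L), via eigendecomposition."""
    lam, phi = np.linalg.eigh(L)
    return (phi * np.exp(-lam * t)) @ phi.T


def aggregated_heat_kernel(L, s=1.0):
    """Assumed aggregation over all times (hypothetical form):
    integral over t in [0, inf) of exp(-s t) * exp(-t L), which equals
    (s I + L)^{-1} for s > 0. Each eigen-direction is weighted by
    1 / (s + lambda_i) instead of exp(-lambda_i t) for one fixed t."""
    lam, phi = np.linalg.eigh(L)
    return (phi / (s + lam)) @ phi.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two blobs with different densities (synthetic, for illustration only).
    X = np.vstack([rng.normal(0.0, 0.3, (10, 2)), rng.normal(3.0, 1.0, (10, 2))])
    L = normalized_laplacian(gaussian_affinity(X, sigma=1.0))
    K = aggregated_heat_kernel(L, s=0.5)
    print(K.shape)
```

Because the aggregated kernel no longer depends on a single diffusion time t, a spectral embedding built from it has one fewer parameter to tune, which is one plausible reading of the stability claim above.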

Research Organization:
Brookhaven National Laboratory (BNL), Upton, NY (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
Grant/Contract Number:
SC00112704
OSTI ID:
1210172
Report Number(s):
BNL-108231-2015-JA
Journal Information:
ACM Transactions on Knowledge Discovery from Data, Vol. 9, Issue 4; ISSN 1556-4681
Publisher:
Association for Computing Machinery
Country of Publication:
United States
Language:
English
Citation Metrics:
Cited by: 2 works
Citation information provided by
Web of Science
