Density-Aware Clustering Based on Aggregated Heat Kernel and Its Transformation
Journal Article · ACM Transactions on Knowledge Discovery from Data
- Stony Brook Univ., NY (United States). Computer Sciences Dept.
- Brookhaven National Lab. (BNL), Upton, NY (United States). Computational Science Center
Current spectral clustering algorithms are sensitive to noise and to parameter scaling, and may not account for different density distributions across clusters. If these problems are left untreated, the resulting clusters cannot accurately represent the true data patterns, particularly for complex real-world datasets with heterogeneous densities. This paper aims to solve these problems by proposing a diffusion-based Aggregated Heat Kernel (AHK) to improve clustering stability, and a Local Density Affinity Transformation (LDAT) to correct the bias originating from different cluster densities. AHK statistically models the heat diffusion traces along the entire time scale, which ensures robustness during the clustering process, while LDAT probabilistically reveals the local density of each instance and suppresses the local density bias in the affinity matrix. Our proposed framework integrates these two techniques systematically. As a result, it not only provides an advanced noise-resistant and density-aware spectral mapping of the original dataset, but also remains stable while the scaling parameter (which usually controls the range of the neighborhood) is tuned. Furthermore, our framework works well with the majority of similarity kernels, which ensures its applicability to many types of data and problem domains. Systematic experiments on different applications show that our proposed algorithms outperform state-of-the-art clustering algorithms on data with heterogeneous density distributions, and achieve robust clustering performance with respect to tuning the scaling parameter and handling various levels and types of noise.
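The abstract gives no formulas, so the following Python sketch is only an illustration of the two ideas under stated assumptions, not the authors' algorithm. It assumes the heat kernel of a normalized graph Laplacian, exp(-tL), averaged over a discrete grid of diffusion times as a stand-in for aggregation along the time scale, and a simple density proxy (the mean of each row's k largest affinities) as a stand-in for a local density correction. The function names, the time grid, and the density proxy are all hypothetical.

```python
import numpy as np

def gaussian_affinity(X, sigma):
    """Dense Gaussian similarity matrix with a zero diagonal."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def aggregated_heat_kernel(W, times):
    """Average of exp(-t L) over the given times (assumed aggregation rule).

    L is the symmetric normalized Laplacian I - D^{-1/2} W D^{-1/2}.
    """
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    lam, phi = np.linalg.eigh(L)
    # exp(-t L) = phi diag(exp(-t lam)) phi^T, so averaging over t only
    # changes the per-eigenvalue weight.
    weights = np.exp(-np.outer(times, lam)).mean(axis=0)
    return (phi * weights) @ phi.T

def local_density_rescale(K, k):
    """Divide each affinity by the geometric mean of two local densities.

    The density of a point is approximated (assumption) by the mean of its
    k largest off-diagonal affinities.
    """
    off = K.copy()
    np.fill_diagonal(off, -np.inf)
    rho = np.sort(off, axis=1)[:, -k:].mean(axis=1)
    rho = np.maximum(rho, 1e-12)  # guard against division by zero
    return K / np.sqrt(np.outer(rho, rho))

# Toy data: one dense and one sparse Gaussian blob.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (40, 2)),
               rng.normal(6.0, 1.5, (20, 2))])
W = gaussian_affinity(X, sigma=1.0)
K = aggregated_heat_kernel(W, np.linspace(0.1, 10.0, 50))
A = local_density_rescale(K, k=5)
```

The rescaled matrix A could then be passed to any spectral clustering routine that accepts a precomputed affinity. Averaging over many diffusion times is what makes the result less dependent on any single scale choice in this sketch.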
- Research Organization:
- Brookhaven National Laboratory (BNL), Upton, NY (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
- OSTI ID:
- 1210172
- Report Number(s):
- BNL--108231-2015-JA
- Journal Information:
- ACM Transactions on Knowledge Discovery from Data, Vol. 9, Issue 4; ISSN 1556-4681
- Publisher:
- Association for Computing Machinery
- Country of Publication:
- United States
- Language:
- English