Parallel k-means++ for Multiple Shared-Memory Architectures
In recent years, k-means++ has become a popular initialization technique for improved k-means clustering. To date, most of the work done to improve its performance has involved parallelizing algorithms that only approximate k-means++. In this paper, we present a parallelization of the exact k-means++ algorithm, with a proof of its correctness. We develop implementations for three distinct shared-memory architectures: multicore CPU, high-performance GPU, and the massively multithreaded Cray XMT platform. We demonstrate the scalability of the algorithm on each platform. In addition, we present a visual approach for showing which platform performed k-means++ the fastest for varying data sizes.
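For readers unfamiliar with the initialization that the abstract refers to, the following is a minimal serial sketch of standard k-means++ seeding: the first center is drawn uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. It is not the parallel algorithm from the paper, and the names `sq_dist` and `kmeanspp_seed` are hypothetical, chosen only for this illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length point tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by standard k-means++ seeding.

    Illustrative serial sketch only; function name and signature are
    assumptions, not taken from the paper.
    """
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    # First center: uniform random choice.
    centers = [points[rng.randrange(len(points))]]
    # d2[i] holds the squared distance from points[i] to its nearest center.
    d2 = [sq_dist(p, centers[0]) for p in points]

    while len(centers) < k:
        total = sum(d2)
        if total == 0.0:
            # Every point coincides with a chosen center; fall back to uniform.
            idx = rng.randrange(len(points))
        else:
            # Sample an index with probability d2[i] / total.
            r = rng.random() * total
            acc = 0.0
            idx = len(points) - 1  # guard against floating-point round-off
            for i, w in enumerate(d2):
                acc += w
                if r < acc:
                    idx = i
                    break
        centers.append(points[idx])
        # Update nearest-center distances with the newly added center.
        d2 = [min(old, sq_dist(p, points[idx])) for p, old in zip(points, d2)]
    return centers
```

The per-point distance update and the weighted draw are the two steps that dominate the running time in this sketch, since each touches every point once per chosen center.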
- Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
- Sponsoring Organization: USDOE
- DOE Contract Number: AC05-76RL01830
- OSTI ID: 1334876
- Report Number(s): PNNL-SA-116273
- Resource Relation: Conference: 45th International Conference on Parallel Processing (ICPP 2016), August 16-19, 2016, Philadelphia, PA, 93-102
- Country of Publication: United States
- Language: English
Similar Records
- Parallel k-means++ · Software · April 4, 2017
- Aho-Corasick String Matching on Shared and Distributed Memory Parallel Architectures · Journal Article · March 1, 2012 · IEEE Transactions on Parallel and Distributed Systems
- Approximate Weighted Matching On Emerging Manycore and Multithreaded Architectures · Journal Article · November 30, 2012 · International Journal of High Performance Computing Applications, 26(4):413-430