OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Parallel k-means++ for Multiple Shared-Memory Architectures

Conference
DOI: https://doi.org/10.1109/ICPP.2016.18 · OSTI ID: 1334876

In recent years, k-means++ has become a popular initialization technique for improved k-means clustering. To date, most of the work done to improve its performance has involved parallelizing algorithms that are only approximations of k-means++. In this paper, we present a parallelization of the exact k-means++ algorithm, with a proof of its correctness. We develop implementations for three distinct shared-memory architectures: multicore CPU, high-performance GPU, and the massively multithreaded Cray XMT platform. We demonstrate the scalability of the algorithm on each platform. In addition, we present a visual approach for showing which platform performed k-means++ fastest for varying data sizes.
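
For orientation, the standard sequential k-means++ seeding procedure that the abstract refers to can be sketched as follows. This is a minimal illustration of the well-known D^2 sampling scheme, not the parallel implementation from the paper, which this record does not describe. The names `sq_dist` and `kmeanspp_seed` are hypothetical and chosen only for this example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by k-means++ (D^2) sampling.

    Sequential sketch only; names and structure are illustrative
    assumptions, not taken from the paper.
    """
    rng = rng or random.Random()
    # First center: chosen uniformly at random.
    centers = [rng.choice(points)]
    # d2[i] = squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Sample the next center with probability proportional to d2.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Each point's update is independent of the others, which is why
        # this step is a natural candidate for data-parallel execution.
        d2 = [min(old, sq_dist(p, nxt)) for old, p in zip(d2, points)]
    return centers


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
    print(kmeanspp_seed(pts, 3, random.Random(0)))
```

In this sketch, the per-point distance update and the weighted selection dominate the cost of each round. How the paper distributes those steps across the multicore CPU, GPU, and Cray XMT platforms is not stated in this record.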

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1334876
Report Number(s):
PNNL-SA-116273
Resource Relation:
Conference: 45th International Conference on Parallel Processing (ICPP 2016), August 16-19, 2016, Philadelphia, PA, 93-102
Country of Publication:
United States
Language:
English

Similar Records

Parallel k-means++
Software · April 4, 2017 · OSTI ID:1334876

Aho-Corasick String Matching on Shared and Distributed Memory Parallel Architectures
Journal Article · March 1, 2012 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:1334876

Approximate Weighted Matching On Emerging Manycore and Multithreaded Architectures
Journal Article · November 30, 2012 · International Journal of High Performance Computing Applications, 26 (4): 413-430 · OSTI ID:1334876