U.S. Department of Energy
Office of Scientific and Technical Information

Parallel k-means++ for Multiple Shared-Memory Architectures

Conference · DOI: https://doi.org/10.1109/ICPP.2016.18 · OSTI ID: 1334876

In recent years, k-means++ has become a popular initialization technique for improved k-means clustering. To date, most of the work done to improve its performance has involved parallelizing algorithms that are only approximations of k-means++. In this paper, we present a parallelization of the exact k-means++ algorithm, with a proof of its correctness. We develop implementations for three distinct shared-memory architectures: multicore CPU, high-performance GPU, and the massively multithreaded Cray XMT platform. We demonstrate the scalability of the algorithm on each platform. In addition, we present a visual approach for showing which platform performed k-means++ the fastest for varying data sizes.
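For readers unfamiliar with the seeding step the abstract refers to, the following is a minimal sequential sketch of exact k-means++ initialization (D² sampling) in plain Python. It is not the authors' parallel implementation, which this record does not describe. The names `sq_dist` and `kmeanspp_seed` are hypothetical, and the uniform fallback for fully duplicated data is an assumption of this sketch. The per-point distance update and the running sum over weights are marked in comments because each point is handled independently there, which makes them natural candidates for a shared-memory implementation to distribute.

```python
import bisect
import random
from itertools import accumulate


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Choose k initial centers from `points` by exact D^2 sampling.

    Hypothetical helper for illustration only (sequential, stdlib only).
    """
    rng = rng or random.Random()
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(points)")

    # First center: chosen uniformly at random.
    centers = [points[rng.randrange(n)]]

    # d2[i] = squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]

    while len(centers) < k:
        # Running (prefix) sum of the weights; the last entry is the total.
        cum = list(accumulate(d2))
        total = cum[-1]
        if total == 0.0:
            # Assumption of this sketch: every point coincides with a
            # center, so fall back to a uniform choice.
            idx = rng.randrange(n)
        else:
            # Pick index i with probability d2[i] / total.
            r = rng.random() * total
            idx = min(bisect.bisect_right(cum, r), n - 1)
        new_center = points[idx]
        centers.append(new_center)

        # Per-point update: each entry depends only on its own point,
        # so the iterations are independent of one another.
        d2 = [min(old, sq_dist(p, new_center)) for old, p in zip(d2, points)]

    return centers
```

A call such as `kmeanspp_seed(data, 3, random.Random(0))` returns three points drawn from `data`; passing a seeded `random.Random` makes the selection reproducible. The returned centers would then be handed to an ordinary k-means iteration.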

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1334876
Report Number(s):
PNNL-SA-116273
Country of Publication:
United States
Language:
English

Similar Records

Parallel k-means++
Software · Mon Apr 03 20:00:00 EDT 2017 · OSTI ID:code-54998

A Highly Parallel Implementation of K-Means for Multithreaded Architecture
Conference · Wed Apr 06 00:00:00 EDT 2011 · OSTI ID:1030877

Aho-Corasick String Matching on Shared and Distributed Memory Parallel Architectures
Journal Article · Wed Feb 29 23:00:00 EST 2012 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:1034574