U.S. Department of Energy
Office of Scientific and Technical Information

Parallel k-means++

Software · DOI: https://doi.org/10.11578/dc.20210416.80 · OSTI ID: code-54998 · Code ID: 54998

A parallelization of the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms cluster multidimensional data by attempting to minimize the distance between the data points in each cluster and that cluster's center. K-means++ improves on traditional k-means by selecting the initial seeds for the clustering process more intelligently. Although k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize it. We have developed original C++ code that parallelizes the algorithm on three hardware architectures: GPU using NVIDIA's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. Parallelizing the seed selection on these platforms lets us perform k-means++ clustering much more quickly than was previously possible.
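For readers unfamiliar with the seeding step, the following is a minimal sketch of k-means++ seed selection in C++ with an OpenMP-annotated distance update. It is not the code distributed under this record: the function names `squared_distance` and `kmeanspp_seeds`, the data layout, and the choice of which loop to parallelize are all assumptions made for illustration. It follows the published idea of choosing each new seed with probability proportional to its squared distance from the nearest seed already chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical helper: squared Euclidean distance between two points.
double squared_distance(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Illustrative k-means++ seeding: returns the indices of the chosen seeds.
// May return fewer than k indices if the data has fewer than k distinct points.
std::vector<std::size_t> kmeanspp_seeds(const std::vector<std::vector<double>>& points,
                                        std::size_t k, unsigned seed) {
    const std::size_t n = points.size();
    std::vector<std::size_t> chosen;
    if (n == 0 || k == 0) return chosen;
    if (k > n) k = n;

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> first(0, n - 1);
    chosen.push_back(first(rng));  // first seed: uniform at random

    // min_d2[i] = squared distance from point i to its nearest chosen seed.
    std::vector<double> min_d2(n, std::numeric_limits<double>::max());

    for (std::size_t c = 1; c < k; ++c) {
        const std::vector<double>& latest = points[chosen.back()];
        double total = 0.0;

        // Data-parallel step: each point is updated independently against the
        // most recent seed. The pragma is ignored if OpenMP is not enabled.
        #pragma omp parallel for reduction(+ : total)
        for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(n); ++i) {
            const double d2 = squared_distance(points[i], latest);
            if (d2 < min_d2[i]) min_d2[i] = d2;
            total += min_d2[i];
        }
        if (total <= 0.0) break;  // every point coincides with a seed

        // Sequential weighted draw: walk the cumulative sum of weights.
        std::uniform_real_distribution<double> pick(0.0, total);
        const double r = pick(rng);
        double cumulative = 0.0;
        std::size_t next = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if (min_d2[i] <= 0.0) continue;  // never re-pick a seed location
            next = i;                        // fallback if rounding leaves r unreached
            cumulative += min_d2[i];
            if (cumulative > r) break;
        }
        chosen.push_back(next);
    }
    return chosen;
}
```

Calling `kmeanspp_seeds(points, 3, 42u)` on a vector of equal-length coordinate vectors returns up to three point indices to use as initial cluster centers. In this sketch only the distance update is parallelized, since it dominates the cost and touches each point independently; the weighted draw is left sequential for clarity, and a GPU or Cray XMT version would likely replace it with a parallel prefix sum.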

Short Name / Acronym:
Parallel k-means++
Site Accession Number:
7426; Battelle IPID 31119
Software Type:
Scientific
License(s):
Other (Commercial or Open-Source)
Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE

Primary Award/Contract Number:
AC05-76RL01830
DOE Contract Number:
AC05-76RL01830
Code ID:
54998
OSTI ID:
code-54998
Country of Origin:
United States

Similar Records

Parallel k-means++ for Multiple Shared-Memory Architectures
Conference · Thu Sep 22 00:00:00 EDT 2016 · OSTI ID:1334876

A Highly Parallel Implementation of K-Means for Multithreaded Architecture
Conference · Wed Apr 06 00:00:00 EDT 2011 · OSTI ID:1030877

Parallelizing the Unpacking and Clustering of Detector Data for Reconstruction of Charged Particle Tracks on Multi-core CPUs and Many-core GPUs
Journal Article · Tue Jan 26 23:00:00 EST 2021 · TBD · OSTI ID:1823525
