Parallel k-means++

doi:10.11578/dc.20210416.80

Software · Mon Apr 03 20:00:00 EDT 2017

DOI: https://doi.org/10.11578/dc.20210416.80 · OSTI ID: code-54998 · Code ID: 54998

This software parallelizes the k-means++ seed selection algorithm on three distinct hardware platforms: GPU, multicore CPU, and a multithreaded architecture. K-means++ was developed by David Arthur and Sergei Vassilvitskii in 2007 as an extension of the k-means data clustering technique. These algorithms cluster multidimensional data by attempting to minimize the squared distance between each data point and the center of its assigned cluster. K-means++ improves on traditional k-means by choosing the initial seeds more carefully: the first seed is drawn uniformly at random, and each later seed is drawn with probability proportional to its squared distance from the nearest seed already chosen. While k-means++ has become a popular alternative to traditional k-means clustering, little work has been done to parallelize it. We have developed original C++ code that parallelizes the algorithm on three hardware architectures: GPU using NVIDIA's CUDA/Thrust framework, multicore CPU using OpenMP, and the Cray XMT multithreaded architecture. Parallelizing the seed selection for these platforms lets k-means++ clustering run much faster than was previously possible.
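The seed selection step described above can be sketched in C++ as follows. This is a minimal illustration written for this record, not the PNNL code itself: the names `Point`, `squared_distance`, and `kmeanspp_seeds` are hypothetical, the input is assumed to contain at least k distinct points, and the OpenMP pragma is only one plausible way to parallelize the distance update on a multicore CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical point type for this sketch: one coordinate per dimension.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double squared_distance(const Point& a, const Point& b) {
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) {
        const double diff = a[d] - b[d];
        sum += diff * diff;
    }
    return sum;
}

// Returns the indices of k seeds chosen by k-means++ (D^2) weighting.
// Assumption: points holds at least k distinct points.
std::vector<std::size_t> kmeanspp_seeds(const std::vector<Point>& points,
                                        std::size_t k, unsigned rng_seed) {
    assert(k >= 1 && k <= points.size());
    const std::size_t n = points.size();
    std::mt19937 rng(rng_seed);

    std::vector<std::size_t> seeds;
    seeds.reserve(k);

    // First seed: drawn uniformly at random.
    std::uniform_int_distribution<std::size_t> uniform(0, n - 1);
    seeds.push_back(uniform(rng));

    // min_d2[i] = squared distance from point i to its nearest chosen seed.
    std::vector<double> min_d2(n, std::numeric_limits<double>::max());

    while (seeds.size() < k) {
        const Point& latest = points[seeds.back()];

        // Each iteration writes only min_d2[idx], so the loop is data-parallel.
        // A signed loop index keeps the pragma portable to older OpenMP versions.
        #ifdef _OPENMP
        #pragma omp parallel for
        #endif
        for (long long i = 0; i < static_cast<long long>(n); ++i) {
            const std::size_t idx = static_cast<std::size_t>(i);
            const double d2 = squared_distance(points[idx], latest);
            if (d2 < min_d2[idx]) min_d2[idx] = d2;
        }

        // Next seed: drawn with probability proportional to min_d2.
        // Points already chosen have weight zero.
        std::discrete_distribution<std::size_t> weighted(min_d2.begin(),
                                                         min_d2.end());
        seeds.push_back(weighted(rng));
    }
    return seeds;
}
```

Calling `kmeanspp_seeds(points, k, 42u)` returns k indices into `points` that can serve as initial centers for a standard k-means iteration. In this sketch only the distance update is parallel and the weighted draw stays serial; a GPU or Cray XMT version would presumably replace the draw with a parallel prefix sum over the weights.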

Short Name / Acronym: Parallel k-means++

Site Accession Number: 7426; Battelle IPID 31119

Software Type: Scientific

License(s): Other (Commercial or Open-Source)

Research Organization: Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)

Sponsoring Organization: USDOE

Primary Award/Contract Number: AC05-76RL01830

DOE Contract Number: AC05-76RL01830

Code ID: 54998

OSTI ID: code-54998

Country of Origin: United States

Similar Records

Parallel k-means++ for Multiple Shared-Memory Architectures

Conference · Thu Sep 22 00:00:00 EDT 2016 · OSTI ID:1334876

A Highly Parallel Implementation of K-Means for Multithreaded Architecture

Conference · Wed Apr 06 00:00:00 EDT 2011 · OSTI ID:1030877

Parallelizing the Unpacking and Clustering of Detector Data for Reconstruction of Charged Particle Tracks on Multi-core CPUs and Many-core GPUs

Journal Article · Tue Jan 26 23:00:00 EST 2021 · TBD · OSTI ID:1823525
