U.S. Department of Energy
Office of Scientific and Technical Information

Fast tree-based algorithms for DBSCAN for low-dimensional data on GPUs

Conference
DBSCAN is a well-known density-based clustering algorithm for discovering clusters of arbitrary shape. While conceptually simple in serial, the algorithm is challenging to parallelize efficiently on manycore GPU architectures. Common pitfalls, such as asynchronous range query calls, result in high thread execution divergence in many implementations. In this paper, we propose a new framework for GPU-accelerated DBSCAN and describe two tree-based algorithms within that framework. Both algorithms fuse the search for neighbors with updating cluster information, but differ in their treatment of dense regions of the data. We show that the time taken to compute clusters is at most twice the time taken to determine the neighbors. We compare the proposed algorithms with existing CPU and GPU implementations, and demonstrate their competitiveness and performance using a fast traversal structure (bounding volume hierarchy) for low-dimensional data. We also show that memory usage can be reduced by processing object neighbors dynamically without storing them.
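To make the idea of fusing the neighbor search with the cluster update concrete, the following is a minimal serial sketch in Python. It is not the authors' GPU implementation. It assumes a brute-force distance scan in place of the bounding volume hierarchy, uses a plain union-find (disjoint-set) structure to merge clusters, and the names `dbscan_union_find`, `find`, and `union` are hypothetical. Each neighbor is processed as soon as it is found, so no neighbor lists are stored.

```python
import math


def find(parent, i):
    """Return the representative of i, compressing the path by halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def union(parent, i, j):
    """Merge the sets containing i and j; the smaller root index wins."""
    ri, rj = find(parent, i), find(parent, j)
    if ri < rj:
        parent[rj] = ri
    elif rj < ri:
        parent[ri] = rj


def dbscan_union_find(points, eps, min_pts):
    """Illustrative DBSCAN: returns one label per point, -1 for noise.

    Assumption: brute-force O(n^2) neighbor scans stand in for a tree
    traversal. A point counts itself toward min_pts.
    """
    n = len(points)

    def close(i, j):
        return math.dist(points[i], points[j]) <= eps

    # Pass 1: count neighbors to decide which points are core points.
    is_core = [sum(1 for j in range(n) if close(i, j)) >= min_pts
               for i in range(n)]

    parent = list(range(n))
    claimed = [False] * n  # border points already attached to a cluster

    # Pass 2: fused search and update. Every neighbor found is handled
    # immediately instead of being appended to a stored neighbor list.
    for i in range(n):
        if not is_core[i]:
            continue
        for j in range(n):
            if j == i or not close(i, j):
                continue
            if is_core[j]:
                union(parent, i, j)      # core-core edge merges clusters
            elif not claimed[j]:
                claimed[j] = True        # border point joins first core seen
                parent[j] = i

    return [find(parent, i) if (is_core[i] or claimed[i]) else -1
            for i in range(n)]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan_union_find(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 3, 3, 3, -1]
```

In this sketch the neighbor scan runs twice, once to count neighbors and once to merge clusters. On a GPU the inner loop would be replaced by a tree traversal per thread and the union operations would need to be atomic; both are outside the scope of this example.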
Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
2000431
Country of Publication:
United States
Language:
English

