Efficient Breadth First Search on Multi-GPU Systems Using GPU-Centric OpenSHMEM
- NVIDIA Corporation, Santa Clara, CA (United States)
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
NVSHMEM is an implementation of OpenSHMEM for NVIDIA GPUs that allows communication to be issued from inside CUDA kernels. In this work, we present an implementation of Breadth First Search (BFS) for multi-GPU systems using NVSHMEM. We analyze the benefits and bottlenecks of moving fine-grained communication into CUDA kernels. Our BFS implementation achieves up to a 75% performance improvement over a CUDA-aware MPI-based implementation.
- Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Organization: USDOE Office of Science (SC)
- OSTI ID: 1567474
- Country of Publication: United States
- Language: English