Accelerating GNNs on GPU Sparse Tensor Cores through N:M Sparsity-Oriented Graph Reordering

Conference
Recent GPUs have introduced Sparse Tensor Cores (SPTC) to accelerate computations on sparse matrices that conform to N:M sparse patterns. Software tools extend this support to the more general V:N:M patterns. Graphs in Graph Neural Networks (GNNs) are typically sparse, but their sparsity is often irregular and does not conform to the required V:N:M sparse patterns. This paper proposes a novel graph reordering algorithm that transforms irregular graph data into the required sparse patterns so that GNNs can benefit from SPTC. The optimization is lossless, so it maintains the accuracy of GNNs. It also preserves the symmetry of the graphs' adjacency matrices, so the same matrices remain compatible with many symmetry-based graph algorithms. The optimization removes 98-100% of the violations of the N:M sparse patterns at the vector level and increases the portion of conforming graphs in the SuiteSparse collection from 5-9% to 88.7-93.5%. On A100 GPUs, the optimization accelerates Sparse Matrix-Matrix multiplication (SpMM) by up to 43X (geometric-mean speedups of 2.3X to 7.5X) over cuSPARSE and speeds up the key graph operations in GNNs on real graphs by as much as 8.6X (3.5X on average).
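The idea of reordering a graph toward an N:M pattern can be illustrated with a small sketch. It is not the paper's algorithm: it assumes the common 2:4 reading of N:M (at most N nonzeros in every aligned group of M consecutive row entries), ignores the vector-level V:N:M extension, and uses a hand-picked permutation rather than a computed one. The helper names `count_nm_violations`, `symmetric_permute`, and `is_symmetric` are hypothetical and exist only for this illustration.

```python
from typing import List

Matrix = List[List[int]]


def count_nm_violations(a: Matrix, n: int = 2, m: int = 4) -> int:
    """Count aligned row segments of m entries that hold more than n nonzeros."""
    violations = 0
    for row in a:
        for start in range(0, len(row), m):
            segment = row[start:start + m]
            if sum(1 for x in segment if x != 0) > n:
                violations += 1
    return violations


def symmetric_permute(a: Matrix, perm: List[int]) -> Matrix:
    """Relabel vertices: new vertex i is old vertex perm[i].

    The same permutation is applied to rows and columns, so the result
    describes the same graph and stays symmetric if the input was.
    """
    return [[a[pi][pj] for pj in perm] for pi in perm]


def is_symmetric(a: Matrix) -> bool:
    size = len(a)
    return all(a[i][j] == a[j][i] for i in range(size) for j in range(size))


# Toy undirected graph (an assumption for illustration): vertex 0 is
# connected to vertices 1, 2, and 3, so row 0 has three nonzeros in its
# first group of four columns, which breaks the 2:4 pattern.
size = 8
adjacency = [[0] * size for _ in range(size)]
for u, v in [(0, 1), (0, 2), (0, 3)]:
    adjacency[u][v] = adjacency[v][u] = 1

# Hand-picked relabeling: swap vertices 3 and 4 so that the neighbors of
# vertex 0 are spread over two column groups.
perm = [0, 1, 2, 4, 3, 5, 6, 7]
reordered = symmetric_permute(adjacency, perm)

violations_before = count_nm_violations(adjacency)
violations_after = count_nm_violations(reordered)
edges_before = sum(map(sum, adjacency))
edges_after = sum(map(sum, reordered))
```

In this toy case the relabeling removes the single violation while keeping every edge and the symmetry of the matrix, which is the sense in which such a reordering is lossless. Choosing a good permutation automatically for large, irregular graphs is the hard part that the paper addresses.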
Research Organization:
North Carolina State University
Sponsoring Organization:
USDOE Office of Energy Efficiency and Renewable Energy (EERE), Renewable Power Office. Solar Energy Technologies Office
DOE Contract Number:
EE0009357
OSTI ID:
2524569
Country of Publication:
United States
Language:
English
