Multithreaded sparse matrix-matrix multiplication for many-core and GPU architectures
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sparse matrix-matrix multiplication is a key kernel with applications in several domains, such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this work, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high-performance computing architectures. The performance of these algorithms depends on the data structures they use. We compare different types of accumulators in these algorithms and demonstrate the performance differences between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two-phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.
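To make the two-phase idea in the abstract concrete, the following is a minimal sequential Python sketch, not the kkSpGEMM implementation. It assumes matrices in compressed sparse row (CSR) form and uses a Python `dict` as a stand-in for a hash-based accumulator; the abstract does not say which accumulator types the paper compares, so that choice is an assumption. The function names `spgemm_symbolic` and `spgemm_numeric` are hypothetical.

```python
def spgemm_symbolic(a_ptr, a_idx, b_ptr, b_idx):
    """Symbolic phase: build the CSR row pointer of C = A * B from structure only."""
    n_rows = len(a_ptr) - 1
    c_ptr = [0] * (n_rows + 1)
    for i in range(n_rows):
        cols = set()  # distinct column indices that appear in row i of C
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            cols.update(b_idx[b_ptr[k]:b_ptr[k + 1]])
        c_ptr[i + 1] = c_ptr[i] + len(cols)
    return c_ptr


def spgemm_numeric(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val, c_ptr):
    """Numeric phase: fill column indices and values of C using a precomputed c_ptr."""
    n_rows = len(c_ptr) - 1
    c_idx = [0] * c_ptr[-1]
    c_val = [0.0] * c_ptr[-1]
    for i in range(n_rows):
        acc = {}  # per-row accumulator: column index -> partial sum
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                acc[j] = acc.get(j, 0.0) + a_val[p] * b_val[q]
        start = c_ptr[i]
        for offset, j in enumerate(sorted(acc)):
            c_idx[start + offset] = j
            c_val[start + offset] = acc[j]
    return c_idx, c_val


# A = [[1, 0, 2], [0, 3, 0]] and B = [[0, 4], [5, 0], [6, 0]] in CSR form.
a_ptr, a_idx, a_val = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
b_ptr, b_idx, b_val = [0, 1, 2, 3], [1, 0, 0], [4.0, 5.0, 6.0]

c_ptr = spgemm_symbolic(a_ptr, a_idx, b_ptr, b_idx)
c_idx, c_val = spgemm_numeric(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val, c_ptr)

# Reuse: new values in A with the same sparsity pattern skip the symbolic phase.
a_val_new = [2.0, 4.0, 6.0]
c_idx_new, c_val_new = spgemm_numeric(a_ptr, a_idx, a_val_new, b_ptr, b_idx, b_val, c_ptr)
```

Splitting the work this way means the row pointer computed once from the sparsity patterns can be reused whenever only the numeric values change, which is the kind of reuse the abstract refers to. A parallel version would distribute rows across threads; this sketch leaves that out.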
- Research Organization:
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- AC04-94AL85000; NA-0003525
- OSTI ID:
- 1466997
- Alternate ID(s):
- OSTI ID: 1548025
- Report Number(s):
- SAND2017-13679J; 659607
- Journal Information:
- Parallel Computing, Vol. 78, Issue C; ISSN 0167-8191
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Adaptive sparse matrix-matrix multiplication on the GPU (conference, February 2019)
Register-Aware Optimizations for Parallel Sparse Matrix–Matrix Multiplication (journal, January 2019)
Preparing sparse solvers for exascale computing (journal, January 2020)
Similar Records
Sparse Matrix-Matrix Multiplication on Multilevel Memory Architectures: Algorithms and Experiments
Performance optimization, modeling and analysis of sparse matrix-matrix products on multi-core and many-core processors