Multithreaded sparse matrix-matrix multiplication for many-core and GPU architectures

Deveci, Mehmet; Trott, Christian; Rajamanickam, Sivasankaran

doi:10.1016/j.parco.2018.06.009

Journal Article · July 9, 2018 · Parallel Computing

DOI: https://doi.org/10.1016/j.parco.2018.06.009 · OSTI ID: 1466997

Deveci, Mehmet ^[1]; Trott, Christian ^[1]; Rajamanickam, Sivasankaran ^[1]

Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

Sparse matrix-matrix multiplication is a key kernel with applications in several domains, such as scientific computing and graph analysis. Several algorithms have been studied in the past for this foundational kernel. In this work, we develop parallel algorithms for sparse matrix-matrix multiplication with a focus on performance portability across different high-performance computing architectures. The performance of these algorithms depends on the data structures used in them. We compare different types of accumulators in these algorithms and demonstrate the performance difference between these data structures. Furthermore, we develop a meta-algorithm, kkSpGEMM, to choose the right algorithm and data structure based on the characteristics of the problem. We show performance comparisons on three architectures and demonstrate the need for the community to develop two-phase sparse matrix-matrix multiplication implementations for efficient reuse of the data structures involved.
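
To make the terms "accumulator" and "two-phase" in the abstract concrete, the following is a minimal, sequential Python sketch of row-wise (Gustavson-style) sparse matrix-matrix multiplication on CSR inputs. It is not the authors' kkSpGEMM implementation and does not reproduce its parallelization or algorithm selection. The function names `spgemm_symbolic` and `spgemm_numeric`, the choice of a hash-set accumulator in the symbolic phase, and the choice of a dense accumulator in the numeric phase are assumptions made for illustration only.

```python
def spgemm_symbolic(a_ptr, a_idx, b_ptr, b_idx):
    """Symbolic phase: compute only the sparsity pattern of C = A * B."""
    c_ptr, c_idx = [0], []
    for i in range(len(a_ptr) - 1):
        seen = set()  # hash-based (sparse) accumulator for row i of C
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]  # nonzero A[i, k] selects row k of B
            for q in range(b_ptr[k], b_ptr[k + 1]):
                seen.add(b_idx[q])
        c_idx.extend(sorted(seen))
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx


def spgemm_numeric(a, b, c_ptr, c_idx, n_cols):
    """Numeric phase: fill the values of C for a precomputed pattern."""
    a_ptr, a_idx, a_val = a
    b_ptr, b_idx, b_val = b
    acc = [0.0] * n_cols  # dense accumulator, one slot per column of B
    c_val = [0.0] * len(c_idx)
    for i in range(len(a_ptr) - 1):
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                acc[b_idx[q]] += a_val[p] * b_val[q]
        # Gather row i from the accumulator and reset the touched slots.
        for r in range(c_ptr[i], c_ptr[i + 1]):
            c_val[r] = acc[c_idx[r]]
            acc[c_idx[r]] = 0.0
    return c_val


# A = [[1, 0], [2, 3]] and B = [[0, 4], [5, 0]] in CSR form (ptr, idx, val).
A = ([0, 1, 3], [0, 0, 1], [1.0, 2.0, 3.0])
B = ([0, 1, 2], [1, 0], [4.0, 5.0])

# Run the symbolic phase once ...
c_ptr, c_idx = spgemm_symbolic(A[0], A[1], B[0], B[1])
# ... then reuse its result for two numeric multiplications whose inputs
# share the same sparsity structure but carry different values.
c_val = spgemm_numeric(A, B, c_ptr, c_idx, n_cols=2)
A_scaled = (A[0], A[1], [2.0 * v for v in A[2]])
c_val_scaled = spgemm_numeric(A_scaled, B, c_ptr, c_idx, n_cols=2)
```

In this sketch the symbolic result (`c_ptr`, `c_idx`) is computed once and shared by both numeric calls, which is the kind of data-structure reuse a two-phase interface allows. Which accumulator performs better depends on the matrices and the architecture, which is the comparison the abstract describes.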

Research Organization: Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

Sponsoring Organization: USDOE National Nuclear Security Administration (NNSA)

Grant/Contract Number: AC04-94AL85000; NA-0003525

OSTI ID: 1466997

Alternate ID(s): OSTI ID: 1548025

Report Number(s): SAND2017-13679J; 659607

Journal Information: Parallel Computing, Vol. 78, Issue C; ISSN 0167-8191

Publisher: Elsevier

Country of Publication: United States

Language: English

Citation Metrics:

Cited by: 24 works

Citation information provided by Web of Science

Related Subjects

97 MATHEMATICS AND COMPUTING
Sparse matrix sparse matrix multiplication
KNLs
GPUs
SpGEMM
