Communication-Avoiding and Memory-Constrained Sparse Matrix-Matrix Multiplication at Extreme Scale
- Indiana Univ., Bloomington, IN (United States)
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sparse matrix-matrix multiplication (SpGEMM) is a widely used kernel in various graph, scientific computing and machine learning algorithms. In this paper, we consider SpGEMMs performed on hundreds of thousands of processors generating trillions of nonzeros in the output matrix. Distributed SpGEMM at this extreme scale faces two key challenges: (1) high communication cost and (2) inadequate memory to generate the output. We address these challenges with an integrated communication-avoiding and memory-constrained SpGEMM algorithm that scales to 262,144 cores (more than 1 million hardware threads) and can multiply sparse matrices of any size as long as the inputs and a fraction of the output fit in the aggregated memory. As we go from 16,384 cores to 262,144 cores on a Cray XC40 supercomputer, the new SpGEMM algorithm runs 10x faster when multiplying large-scale protein-similarity matrices.
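The memory-constrained idea in the abstract (only a fraction of the output has to be resident at once) can be illustrated with a minimal single-process sketch. This is not the paper's distributed algorithm and does not model its communication-avoiding part. It assumes a dict-of-dicts sparse format and a simple split of B into contiguous column batches, and the names `spgemm_rows`, `column_batches`, `batched_spgemm`, and `consume` are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict


def spgemm_rows(A, B):
    """Row-wise sparse product C = A * B for matrices stored as {row: {col: value}}."""
    C = {}
    for i, a_row in A.items():
        acc = defaultdict(float)  # sparse accumulator for output row i
        for k, a_ik in a_row.items():
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            C[i] = dict(acc)
    return C


def column_batches(B, n_cols, n_batches):
    """Yield column slices of B, each covering a contiguous range of columns."""
    step = -(-n_cols // n_batches)  # ceiling division
    for start in range(0, n_cols, step):
        stop = min(start + step, n_cols)
        yield {k: {j: v for j, v in row.items() if start <= j < stop}
               for k, row in B.items()}


def batched_spgemm(A, B, n_cols, n_batches, consume):
    """Compute A * B one column batch at a time.

    Each partial output is handed to `consume` (assumed to write, prune, or
    reduce it) before the next batch is formed, so only one batch of the
    output is held in memory at a time.
    """
    for B_batch in column_batches(B, n_cols, n_batches):
        consume(spgemm_rows(A, B_batch))


if __name__ == "__main__":
    # Tiny illustrative inputs (made up for this sketch).
    A = {0: {0: 1.0, 1: 2.0}, 1: {1: 3.0}}
    B = {0: {0: 4.0, 2: 5.0}, 1: {1: 6.0, 2: 7.0}}
    batched_spgemm(A, B, n_cols=3, n_batches=2, consume=print)
```

In this sketch a larger `n_batches` lowers the peak size of the partial output at the cost of re-reading A for every batch, which is the trade-off a memory-constrained formulation has to balance.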
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- AC02-05CH11231
- OSTI ID:
- 1817306
- Journal Information:
- Proceedings - IEEE International Parallel and Distributed Processing Symposium (IPDPS), Vol. 2021; Conference: 2021 IEEE International Symposium on Parallel and Distributed Processing (IPDPS), Portland, OR (United States), 17-21 May 2021; ISSN 1530-2075
- Publisher:
- IEEE
- Country of Publication:
- United States
- Language:
- English
The parallelism motifs of genomic data analysis (journal, January 2020)
Similar Records
High-Performance Sparse Matrix-Matrix Products on Intel KNL and Multicore Architectures
A matrix-algebraic formulation of distributed-memory maximal cardinality matching algorithms in bipartite graphs
Related Subjects
proteins
three-dimensional displays
social networking
scientific computing
memory management
genomics
parallel processing
graph theory
mathematics computing
matrix algebra
matrix multiplication
multiprocessing systems
parallel machines
resource allocations
sparse matrices