Parallel Memory-Independent Communication Bounds for SYRK
- Rutherford Appleton Laboratory, Didcot (United Kingdom)
- Wake Forest University, Winston-Salem, NC (United States)
- Inria Paris (France)
- Inria Lyon (France)
- Inmar Intelligence, Winston-Salem, NC (United States)
In this paper, we focus on the parallel communication cost of multiplying a matrix by its transpose, known as a symmetric rank-k update (SYRK). SYRK requires half the computation of general matrix multiplication because of the symmetry of the output matrix. Recent work (Beaumont et al., SPAA '22) has demonstrated that the sequential I/O complexity of SYRK is also a constant factor smaller than that of general matrix multiplication. Inspired by this progress, we establish memory-independent parallel communication lower bounds for SYRK with smaller constants than general matrix multiplication, and we show that these constants are tight by presenting communication-optimal algorithms. The crux of the lower bound proof relies on extending a key geometric inequality to symmetric computations and analytically solving a constrained nonlinear optimization problem. The optimal algorithms use a triangular blocking scheme for parallel distribution of the symmetric output matrix and corresponding computation.
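To illustrate the computational saving mentioned in the abstract, the following is a minimal sequential sketch, not the paper's parallel algorithm. It computes C = A·Aᵀ by evaluating only the entries on or below the diagonal and mirroring them. The function name `syrk_lower` and the multiplication counter are assumptions introduced here for illustration.

```python
def syrk_lower(A):
    """Compute C = A * A^T for an n-by-k matrix A (list of rows).

    Only entries with i >= j are computed; each is mirrored to (j, i).
    Returns (C, mults), where mults counts scalar multiplications.
    Hypothetical helper for illustration, not from the paper.
    """
    n = len(A)
    k = len(A[0]) if n else 0
    C = [[0.0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(i + 1):          # lower triangle, diagonal included
            s = 0.0
            for p in range(k):
                s += A[i][p] * A[j][p]  # row i of A times row j of A
                mults += 1
            C[i][j] = s
            C[j][i] = s                 # symmetry: C[j][i] equals C[i][j]
    return C, mults


A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
C, mults = syrk_lower(A)
gemm_mults = len(A) * len(A) * len(A[0])  # cost of computing all n*n entries
```

For an n-by-k input the sketch performs k·n(n+1)/2 multiplications, compared with k·n² when every output entry is computed, so the ratio approaches one half as n grows.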
- Research Organization:
- Wake Forest University, Winston-Salem, NC (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF); European Research Council (ERC)
- DOE Contract Number:
- SC0023296; CCF-1942892; OAC-2106920; 810367
- OSTI ID:
- 1987853
- Journal Information:
- Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, Conference: SPAA '23: 35. ACM Symposium on Parallelism in Algorithms and Architectures, Orlando, FL (United States), 17-19 Jun 2023
- Country of Publication:
- United States
- Language:
- English