U.S. Department of Energy
Office of Scientific and Technical Information

Parallel Memory-Independent Communication Bounds for SYRK

Conference · Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures

In this paper, we focus on the parallel communication cost of multiplying a matrix with its transpose, known as a symmetric rank-k update (SYRK). SYRK requires half the computation of general matrix multiplication because of the symmetry of the output matrix. Recent work (Beaumont et al., SPAA '22) has demonstrated that the sequential I/O complexity of SYRK is also a constant factor smaller than that of general matrix multiplication. Inspired by this progress, we establish memory-independent parallel communication lower bounds for SYRK with smaller constants than general matrix multiplication, and we show that these constants are tight by presenting communication-optimal algorithms. The lower bound proof hinges on extending a key geometric inequality to symmetric computations and analytically solving a constrained nonlinear optimization problem. The optimal algorithms use a triangular blocking scheme for parallel distribution of the symmetric output matrix and corresponding computation.
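
The claim that SYRK needs about half the computation of general matrix multiplication can be illustrated with a minimal sequential sketch. It is not the paper's parallel algorithm and does not implement its triangular blocking scheme; the function names `syrk_lower` and `gemm_mult_count` are hypothetical, chosen for illustration only. The sketch computes just the lower triangle of C = A·Aᵀ for an n × k matrix A and counts scalar multiplications, which total n(n+1)/2 · k instead of the n² · k needed when every entry of C is computed.

```python
def syrk_lower(A):
    """Compute the lower triangle of C = A * A^T (hypothetical helper).

    A is an n x k matrix given as a list of rows. Returns (C, mults), where
    C is n x n with the strict upper triangle left at zero and mults is the
    number of scalar multiplications performed.
    """
    n = len(A)
    k = len(A[0]) if n else 0
    C = [[0.0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        # By symmetry C[i][j] == C[j][i], so only entries with j <= i are computed.
        for j in range(i + 1):
            s = 0.0
            for p in range(k):
                s += A[i][p] * A[j][p]
                mults += 1
            C[i][j] = s
    return C, mults


def gemm_mult_count(n, k):
    """Scalar multiplications needed to compute all n * n entries of A * A^T."""
    return n * n * k


A = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
C, mults = syrk_lower(A)
print(C)
print(mults, gemm_mult_count(len(A), len(A[0])))
```

For this 3 × 2 input the triangle-only loop performs 12 multiplications against 18 for the full product; the ratio (n+1)/(2n) approaches 1/2 as n grows.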

Research Organization:
Wake Forest University, Winston-Salem, NC (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF); European Research Council (ERC)
DOE Contract Number:
SC0023296; CCF-1942892; OAC-2106920; 810367
Journal Information:
Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, Conference: SPAA '23: 35th ACM Symposium on Parallelism in Algorithms and Architectures, Orlando, FL (United States), 17-19 Jun 2023
Country of Publication:
United States
