Optimizing matrix operations on a parallel multiprocessor with a memory hierarchy
The memory organizations of supercomputers (CRAY 2, CEDAR) are becoming increasingly complex, and data management in these memories is correspondingly becoming a crucial factor in achieving high performance. We study here an architecture combining vector and parallel capabilities on a two-level shared memory structure. For this class of architecture, we analyze and optimize matrix multiplication algorithms so as to obtain high-efficiency kernels that can be used in many numerical algorithms, such as LU and Cholesky factorizations, as well as Gram-Schmidt and Householder orthogonal factorization schemes. The performance of these kernels on the Alliant FX/8 multiprocessor and their application to the Gram-Schmidt procedure are described in detail.
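To illustrate the general idea behind matrix multiplication kernels tuned for a memory hierarchy, the following is a minimal sketch of a blocked (tiled) multiplication in plain Python. It is not the kernel from the report: the function name `blocked_matmul` and the `block` parameter are hypothetical, and the tile size is an arbitrary assumption standing in for whatever fits in the fast memory level of a given machine.

```python
def blocked_matmul(a, b, block=2):
    """Multiply a (n x k) by b (k x m), both lists of lists, tile by tile.

    `block` is an assumed tile edge length; on a real machine it would be
    chosen so that the tiles being combined fit in the fast memory level.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer three loops walk over tiles of C, and over the shared dimension.
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for p0 in range(0, k, block):
                # Inner three loops update one tile of C from one tile of A
                # and one tile of B; min() handles ragged edge tiles.
                for i in range(i0, min(i0 + block, n)):
                    row_a, row_c = a[i], c[i]
                    for p in range(p0, min(p0 + block, k)):
                        a_ip = row_a[p]
                        row_b = b[p]
                        for j in range(j0, min(j0 + block, m)):
                            row_c[j] += a_ip * row_b[j]
    return c
```

Each tile of A and B is reused for a whole tile of C before the loops move on, which is the data-reuse property that blocked kernels rely on. In this sketch the innermost loop runs along a row, the kind of contiguous operation a vector unit could process, while the independent tiles of C are the natural unit to distribute across processors.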
- Research Organization: Illinois Univ., Urbana (USA). Center for Supercomputing Research and Development
- DOE Contract Number: FG02-85ER25001
- OSTI ID: 7204163
- Report Number(s): DOE/ER/25001-20; CSRD-555; ON: DE87002116
- Country of Publication: United States
- Language: English
Similar Records
A multiprocessor scheme for the singular value decomposition
A multiprocessor algorithm for the symmetric tridiagonal Eigenvalue problem
Related Subjects
71 CLASSICAL AND QUANTUM MECHANICS
GENERAL PHYSICS
99 GENERAL AND MISCELLANEOUS
990200* -- Mathematics & Computers
COMPUTERS
DATA
EXPERIMENTAL DATA
INFORMATION
MATRICES
MEMORY DEVICES
NUMERICAL DATA
PARALLEL PROCESSING
PERFORMANCE TESTING
PROGRAMMING
TESTING
VECTOR PROCESSING