Use of Level 3 BLAS in LU factorization in a multiprocessing environment on three vector multiprocessors: The ALLIANT FX/80, the CRAY-2, and the IBM 3090 VF
The authors study various implementations of block Gaussian elimination on full matrices and examine their performance on three parallel computers, the ALLIANT FX/80, the CRAY-2, and the IBM 3090-400/VF. These implementations are expressed in terms of Level 3 BLAS matrix-matrix kernels. The authors consider the use of parallel Level 3 BLAS kernels and compare the parallelism obtained within the computational kernels with that obtained when parallelizing over the kernels. They show that the use of parallel Level 3 BLAS allows portability without sacrifice of efficiency, even in a parallel environment, and that high speeds can be obtained if tuned versions of the kernels are available. (Copyright (c) Science and Engineering Research Council 1990.)
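The structure the abstract refers to, block Gaussian elimination written as calls to matrix-matrix kernels, can be illustrated with a minimal sketch. This is not the authors' code. It assumes a right-looking block variant and omits pivoting for brevity, so it only suits matrices that need no row interchanges (for example, strictly diagonally dominant ones). The function names are hypothetical, and the plain Python loops stand in for the Level 3 BLAS triangular-solve (_TRSM) and matrix-multiply (_GEMM) kernels, which a real implementation would call instead.

```python
def lu_unblocked(a, k0, k1):
    """In-place LU (no pivoting) of the diagonal block a[k0:k1][k0:k1]."""
    for k in range(k0, k1):
        for i in range(k + 1, k1):
            a[i][k] /= a[k][k]
            for j in range(k + 1, k1):
                a[i][j] -= a[i][k] * a[k][j]


def trsm_lower_unit(a, k0, k1, n):
    """U12 = inv(L11) * A12 by forward substitution (stand-in for _TRSM)."""
    for j in range(k1, n):
        for i in range(k0, k1):
            for p in range(k0, i):
                a[i][j] -= a[i][p] * a[p][j]


def trsm_upper(a, k0, k1, n):
    """L21 = A21 * inv(U11) by substitution along rows (stand-in for _TRSM)."""
    for i in range(k1, n):
        for j in range(k0, k1):
            for p in range(k0, j):
                a[i][j] -= a[i][p] * a[p][j]
            a[i][j] /= a[j][j]


def gemm_update(a, k0, k1, n):
    """A22 = A22 - L21 * U12 (stand-in for _GEMM); most of the flops are here."""
    for i in range(k1, n):
        for j in range(k1, n):
            s = 0.0
            for p in range(k0, k1):
                s += a[i][p] * a[p][j]
            a[i][j] -= s


def block_lu(a, nb):
    """Right-looking block LU without pivoting; overwrites a with L and U.

    L is unit lower triangular (stored below the diagonal) and U is upper
    triangular (stored on and above it). nb is the block size.
    """
    n = len(a)
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        lu_unblocked(a, k0, k1)        # factor the diagonal block
        trsm_lower_unit(a, k0, k1, n)  # block row of U
        trsm_upper(a, k0, k1, n)       # block column of L
        gemm_update(a, k0, k1, n)      # update the trailing submatrix
    return a


def reconstruct(lu):
    """Multiply the packed factors back together to check that L * U = A."""
    n = len(lu)
    return [[sum((1.0 if p == i else lu[i][p]) * lu[p][j]
                 for p in range(min(i, j) + 1))
             for j in range(n)] for i in range(n)]
```

In this form the two kinds of parallelism compared in the abstract correspond to two places where work could be split: inside a single kernel call (for example, across the rows of the trailing update), or across independent kernel calls on different blocks within one step.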
- Research Organization: Rutherford Appleton Lab., Chilton (United Kingdom)
- OSTI ID: 5217043
- Report Number(s): PB-91-225193/XAB; RAL--90-083
- Country of Publication: United States
- Language: English
Similar Records
Level 3 BLAS in LU factorization on the CRAY-2, ETA-10P, and IBM 3090-200/VF
Vectorization of a multiprocessor multifrontal code
Related Subjects
990200* -- Mathematics & Computers
ALGORITHMS
ARRAY PROCESSORS
COMPARATIVE EVALUATIONS
COMPUTERS
CRAY COMPUTERS
DIGITAL COMPUTERS
EVALUATION
FACTORIZATION
FORTRAN
IBM COMPUTERS
MATHEMATICAL LOGIC
MATRICES
PARALLEL PROCESSING
PERFORMANCE
PROGRAMMING
PROGRAMMING LANGUAGES
SUPERCOMPUTERS
VECTOR PROCESSING