Gigaflop speed algorithm for the direct solution of large block-tridiagonal systems in 3-D physics applications
Journal Article
·
· Comput. Phys.; (United States)
When the 3-D partial differential equations of many physics problems are discretized, the resulting system of linear equations can be represented by a block-tridiagonal matrix. Depending on the substructure of the blocks, many algorithms can be devised for solving these systems. The plasma physics problems of interest to the authors give rise to several matrix problems whose treatment should be useful in other applications as well. In one case, where the blocks are dense, a multitasked cyclic reduction procedure made it possible to reach gigaflop rates on a Cray-2 for the direct solution of these large linear systems. The recently built code PAMS (parallelized matrix solver) embodies this technique and obtains its performance from fast vendor-supplied routines. Manipulations within the blocks are done by these highly optimized linear algebra subroutines, which exploit vectorization as well as overlap of the functional units within each CPU. In unitasking mode, speeds well above 340 Mflops have been measured. The cyclic reduction method multitasks quite well, with overlap factors in the range of three to four. In multitasking mode, average speeds of 1.1 gigaflops have been measured for the entire PAMS algorithm. In addition to presenting the PAMS algorithm, the paper shows how related systems having banded blocks may be treated efficiently by multitasked cyclic reduction in the Cray-2 multiprocessor environment. The PAMS method is intended for multiprocessors and would not be a method of choice on a uniprocessor. Furthermore, the method's advantage was found to depend critically on the hardware, software, and charging algorithm installed on any given multiprocessor system.
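The abstract does not give the PAMS implementation, so the following is only a minimal serial sketch of block cyclic reduction for a block-tridiagonal system with dense blocks, written with NumPy under stated assumptions. The function name `block_cyclic_reduction`, the array layout (`A`, `B`, `C` of shape `(n, m, m)` for the sub-, main-, and super-diagonal blocks, `d` of shape `(n, m)`), and the use of `np.linalg.solve` in place of vendor-supplied routines are all illustrative choices, not taken from the paper. The sketch also assumes every diagonal block that arises is nonsingular (for example, a block diagonally dominant system), since no pivoting across blocks is done.

```python
import numpy as np

def block_cyclic_reduction(A, B, C, d):
    """Solve A[i] x[i-1] + B[i] x[i] + C[i] x[i+1] = d[i], i = 0..n-1.

    Hypothetical layout: A, B, C have shape (n, m, m); d has shape (n, m).
    A[0] and C[n-1] are never applied to an unknown.
    """
    n, m = d.shape
    if n == 1:
        return np.linalg.solve(B[0], d[0])[None, :]

    odd = np.arange(1, n, 2)
    k = len(odd)
    A2 = np.zeros((k, m, m))
    B2 = np.empty((k, m, m))
    C2 = np.zeros((k, m, m))
    d2 = np.empty((k, m))

    # Eliminate the even-indexed unknowns. Each pass of this loop is
    # independent of the others, which is the part a multitasked
    # version could distribute across CPUs.
    for r, j in enumerate(odd):
        # Left neighbour j-1 is even and always exists.
        left = np.linalg.solve(
            B[j - 1], np.hstack([A[j - 1], C[j - 1], d[j - 1][:, None]]))
        A2[r] = -A[j] @ left[:, :m]
        B2[r] = B[j] - A[j] @ left[:, m:2 * m]
        d2[r] = d[j] - A[j] @ left[:, 2 * m]
        if j + 1 < n:
            # Right neighbour j+1 is even; it may not exist for the last row.
            right = np.linalg.solve(
                B[j + 1], np.hstack([A[j + 1], C[j + 1], d[j + 1][:, None]]))
            B2[r] -= C[j] @ right[:, :m]
            C2[r] = -C[j] @ right[:, m:2 * m]
            d2[r] -= C[j] @ right[:, 2 * m]

    x = np.empty((n, m))
    # The odd unknowns satisfy a block-tridiagonal system of half the size.
    x[1::2] = block_cyclic_reduction(A2, B2, C2, d2)

    # Back-substitute for the even unknowns (again independent per i).
    for i in range(0, n, 2):
        rhs = d[i].copy()
        if i > 0:
            rhs -= A[i] @ x[i - 1]
        if i + 1 < n:
            rhs -= C[i] @ x[i + 1]
        x[i] = np.linalg.solve(B[i], rhs)
    return x
```

Each level halves the number of block rows, so about log2(n) levels are needed, and within a level the block eliminations do not depend on one another. This sketch makes no attempt at the vectorization or multitasking described above, and it solves with each even diagonal block more than once, where a tuned code would factor each block once and reuse the factors.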
- Research Organization:
- National Magnetic Fusion Energy Computer Center, Lawrence Livermore National Laboratory, Livermore, California 94550
- OSTI ID:
- 6635997
- Journal Information:
- Journal Name: Comput. Phys.; (United States); Vol. 3:2; ISSN CPHYE
- Country of Publication:
- United States
- Language:
- English
Similar Records
Solution of single linear tridiagonal systems and vectorization of the ICCG algorithm on the Cray 1
Technical Report
·
Thu Jun 25 00:00:00 EDT 1981
·
OSTI ID:6314990
Multilevel parallel solver for block tridiagonal and banded linear systems. Technical report
Technical Report
·
Fri Aug 25 00:00:00 EDT 1989
·
OSTI ID:5114438
Optimizing tridiagonal solvers for alternating direction methods on Boolean cube multiprocessors
Journal Article
·
Tue May 01 00:00:00 EDT 1990
· SIAM Journal on Scientific and Statistical Computing (Society for Industrial and Applied Mathematics); (USA)
·
OSTI ID:6411089