Parallel conjugate gradient: effects of ordering strategies, programming paradigms, and architectural platforms
The Conjugate Gradient (CG) algorithm is perhaps the best-known iterative technique to solve sparse linear systems that are symmetric and positive definite. A sparse matrix-vector multiply (SPMV) usually accounts for most of the floating-point operations within a CG iteration. In this paper, we investigate the effects of various ordering and partitioning strategies on the performance of parallel CG and SPMV using different programming paradigms and architectures. Results show that for this class of applications, ordering significantly improves overall performance, that cache reuse may be more important than reducing communication, and that it is possible to achieve message passing performance using shared memory constructs through careful data ordering and distribution. However, a multithreaded implementation of CG on the Tera MTA does not require special ordering or partitioning to obtain high efficiency and scalability.
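To illustrate why SPMV dominates the cost of each CG iteration, here is a minimal serial sketch in Python. It is not the paper's implementation: the compressed sparse row (CSR) layout and the names `spmv_csr`, `dot`, and `conjugate_gradient` are assumptions chosen for illustration, and no ordering, partitioning, or parallelism is modeled.

```python
import math


def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR format (assumed layout)."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        s = 0.0
        # The indirect access x[col_idx[k]] is where matrix ordering
        # affects cache reuse: nearby column indices touch nearby memory.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y[i] = s
    return y


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(row_ptr, col_idx, values, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A; returns (x, iterations)."""
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x, with x = 0 initially
    p = list(r)            # initial search direction
    rs_old = dot(r, r)
    for it in range(max_iter):
        if math.sqrt(rs_old) <= tol:
            return x, it
        Ap = spmv_csr(row_ptr, col_idx, values, p)   # one SPMV per iteration
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter


# Small SPD example: A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]], b = A @ [1, 2, 3]
row_ptr = [0, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 1, 2]
values = [4.0, 1.0, 1.0, 3.0, 1.0, 1.0, 2.0]
b = [6.0, 10.0, 8.0]
solution, iterations = conjugate_gradient(row_ptr, col_idx, values, b)
```

Each iteration performs one SPMV, two inner products, and three vector updates. In a distributed-memory setting the SPMV and the inner products are the steps that would require communication, which is where ordering and partitioning choices come into play.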
- Research Organization:
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Director, Office of Science. Office of Advanced Scientific Computing Research. Mathematical, Information, and Computational Sciences Division; National Aeronautics and Space Administration (US)
- DOE Contract Number:
- AC03-76SF00098
- OSTI ID:
- 775130
- Report Number(s):
- LBNL-45828; R&D Project: 618310; TRN: AH200110%%117
- Resource Relation:
- Conference: 13th International Conference on Parallel and Distributed Computing Systems, Las Vegas, NV (US), 08/08/2000--08/10/2000; Other Information: PBD: 1 May 2000
- Country of Publication:
- United States
- Language:
- English
Similar Records
Ordering unstructured meshes for sparse matrix computations on leading parallel systems
Ordering schemes for sparse matrices using modern programming paradigms