QR factorization of a dense matrix on a shared-memory multiprocessor
Abstract
A new algorithm for computing an orthogonal decomposition of a rectangular m × n matrix A on a shared-memory parallel computer is described. The algorithm uses Givens rotations and has the feature that its synchronization cost is low. In particular, for a multiprocessor having p processors, an analysis of the algorithm shows that this cost is O(n^2/p) if m/p ≥ n, and O(mn/p^2) if m/p < n. Note that in the latter case, the synchronization cost is smaller than O(n^2/p). Therefore, the synchronization cost of the algorithm proposed in this article is bounded by O(n^2/p) when m ≥ n. This is important for machines where synchronization cost is high, and when m >> n. Analysis and experiments show that the algorithm is effective in balancing the load and producing high efficiency (speed-up). 13 refs.
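For readers unfamiliar with the building block named in the abstract, the following is a minimal sequential sketch of QR reduction by Givens rotations. It is not the parallel algorithm of the report: the row ordering, the absence of any processor scheduling or synchronization, and the function names `givens` and `givens_qr` are all assumptions made for illustration.

```python
import math


def givens(a, b):
    """Return (c, s) so that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0  # nothing to eliminate
    r = math.hypot(a, b)
    return a / r, b / r


def givens_qr(A):
    """Reduce an m x n matrix (list of rows) to upper-triangular form R.

    Hypothetical helper: a plain sequential sweep, one column at a time,
    eliminating entries from the bottom row upward with adjacent-row rotations.
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]  # work on a copy
    for j in range(min(m - 1, n)):
        for i in range(m - 1, j, -1):
            # Rotate rows i-1 and i so that R[i][j] becomes zero.
            c, s = givens(R[i - 1][j], R[i][j])
            for k in range(j, n):
                top, bot = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * top + s * bot
                R[i][k] = -s * top + c * bot
    return R


if __name__ == "__main__":
    A = [[3.0, 1.0], [4.0, 2.0], [0.0, 5.0]]
    for row in givens_qr(A):
        print(row)
```

Each rotation touches only two rows, which is what makes Givens-based schemes attractive for distributing work across processors. How the report assigns rotations to processors is not stated in the abstract and is not modeled here.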
- Authors:
- Chu, E.; George, A.
- Publication Date:
- 1987-10-01
- Research Org.:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- OSTI Identifier:
- 5928811
- Report Number(s):
- ORNL/TM-10581; ON: DE88001506
- DOE Contract Number:
- AC05-84OR21400
- Resource Type:
- Technical Report
- Resource Relation:
- Other Information: Portions of this document are illegible in microfiche products. Original copy available until stock is exhausted.
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; MATRICES; FACTORIZATION; PARALLEL PROCESSING; ALGORITHMS; ARRAY PROCESSORS; COST ESTIMATION; EFFICIENCY; MEMORY MANAGEMENT; PERFORMANCE; MATHEMATICAL LOGIC; PROGRAMMING; 990210* - Supercomputers- (1987-1989)
Citation Formats
Chu, E., and George, A. QR factorization of a dense matrix on a shared-memory multiprocessor. United States: N. p., 1987. Web. doi:10.2172/5928811.
Chu, E., & George, A. (1987). QR factorization of a dense matrix on a shared-memory multiprocessor. United States. https://doi.org/10.2172/5928811
Chu, E., and George, A. 1987. "QR factorization of a dense matrix on a shared-memory multiprocessor". United States. https://doi.org/10.2172/5928811. https://www.osti.gov/servlets/purl/5928811.
@techreport{osti_5928811,
title = {QR factorization of a dense matrix on a shared-memory multiprocessor},
author = {Chu, E. and George, A.},
abstractNote = {A new algorithm for computing an orthogonal decomposition of a rectangular $m \times n$ matrix $A$ on a shared-memory parallel computer is described. The algorithm uses Givens rotations and has the feature that its synchronization cost is low. In particular, for a multiprocessor having $p$ processors, an analysis of the algorithm shows that this cost is $O(n^2/p)$ if $m/p \geq n$, and $O(mn/p^2)$ if $m/p < n$. Note that in the latter case, the synchronization cost is smaller than $O(n^2/p)$. Therefore, the synchronization cost of the algorithm proposed in this article is bounded by $O(n^2/p)$ when $m \geq n$. This is important for machines where synchronization cost is high, and when $m \gg n$. Analysis and experiments show that the algorithm is effective in balancing the load and producing high efficiency (speed-up). 13 refs.},
doi = {10.2172/5928811},
url = {https://www.osti.gov/biblio/5928811},
institution = {Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)},
number = {ORNL/TM-10581},
address = {United States},
year = {1987},
month = oct
}