An adaptive blocking strategy for matrix factorizations
On most high-performance architectures, data movement is slow compared to floating-point (in particular, vector) performance. On these architectures, block algorithms have been successful for matrix computations. By considering a matrix as a collection of submatrices (the so-called blocks), one naturally arrives at algorithms that require little data movement. The optimal blocking strategy, however, depends on the computing environment and on the problem parameters. Current approaches use fixed-width blocking strategies, which are not optimal. This paper presents an "adaptive blocking" methodology for systematically determining an optimal blocking strategy for a uniprocessor machine. We demonstrate this technique on a block QR factorization routine on a uniprocessor. After generating timing models for the high-level kernels of the algorithm, we can formulate the optimal blocking strategy as a recurrence relation that we can solve inexpensively with a dynamic programming technique. Experiments on one processor of a CRAY 2 show that the resulting blocking strategy is in fact as good as any fixed-width blocking strategy. So while we do not know the optimum fixed-width blocking strategy unless we re-run the same problem several times, adaptive blocking provides optimum performance in the very first run. 22 refs., 4 figs.
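The recurrence-plus-dynamic-programming idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's timing models or the exact recurrence, so everything here is an assumption: the recurrence form T(j) = min over w of cost(j, w) + T(j + w), the function names `optimal_blocking` and `toy_cost`, and the cost coefficients are all hypothetical stand-ins for measured kernel timings.

```python
def optimal_blocking(n, max_width, block_cost):
    """Choose block widths for n columns by dynamic programming.

    block_cost(j, w) is an ASSUMED timing model: the predicted time to
    process a block of width w starting at column j. It is a placeholder
    for the paper's kernel timing models, which are not reproduced here.
    """
    best = [0.0] * (n + 1)   # best[j]: minimal modeled time for columns j..n-1
    choice = [0] * (n + 1)   # choice[j]: width of the first block in that optimum
    for j in range(n - 1, -1, -1):
        best[j] = float("inf")
        for w in range(1, min(max_width, n - j) + 1):
            t = block_cost(j, w) + best[j + w]
            if t < best[j]:
                best[j] = t
                choice[j] = w
    # Walk forward through the recorded choices to recover the block widths.
    widths = []
    j = 0
    while j < n:
        widths.append(choice[j])
        j += choice[j]
    return best[0], widths


def toy_cost(m, n, overhead=50.0):
    """Hypothetical cost model for an m-by-n factorization (illustration only)."""
    def cost(j, w):
        rows = m - j
        panel = rows * w * w                     # slower, unblocked panel work
        update = 0.25 * rows * w * (n - j - w)   # faster, blocked trailing update
        return overhead + panel + update         # fixed per-block startup cost
    return cost


total, widths = optimal_blocking(40, 16, toy_cost(60, 40))
print(total, widths)
```

Because the search covers every sequence of widths up to `max_width`, the modeled time of the result is never worse than that of any fixed-width choice under the same model. Whether that carries over to measured run time depends on how well the timing model matches the machine.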
- Research Organization: Argonne National Lab., IL (USA)
- Sponsoring Organization: DOE/ER
- DOE Contract Number: W-31109-ENG-38
- OSTI ID: 6745432
- Report Number(s): CONF-900992-2; ON: DE90011108
- Country of Publication: United States
- Language: English
Similar Records
A block QR factorization algorithm using restricted pivoting
Conference · Sat Dec 31 23:00:00 EST 1988 · OSTI ID:5587289
A block QR factorization algorithm for rank-deficient matrices
Conference · Sat Dec 31 23:00:00 EST 1988 · OSTI ID:6938041
Automatic Blocking Of QR and LU Factorizations for Locality
Conference · Thu Mar 25 23:00:00 EST 2004 · OSTI ID:15013895