The design and implementation of the parallel out-of-core ScaLAPACK LU, QR and Cholesky factorization routines

D'Azevedo, E F; Dongarra, J J

doi:10.2172/296722

Technical Report · April 1, 1997

DOI: https://doi.org/10.2172/296722 · OSTI ID: 296722

This paper describes the design and implementation of three core factorization routines (LU, QR, and Cholesky) included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense system that is too large to fit entirely in physical memory. An image of the full matrix is maintained on disk, and the factorization routines transfer sub-matrices into memory. The left-looking, column-oriented variant of each factorization algorithm is implemented to reduce disk I/O traffic. The routines are built on a portable I/O interface and use high-performance ScaLAPACK factorization routines as in-core computational kernels. The authors present the details of the implementation of the out-of-core ScaLAPACK factorization routines, as well as performance and scalability results on the Intel Paragon.
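
To make the left-looking, column-oriented idea concrete, the following is a minimal sequential sketch of an out-of-core Cholesky factorization. It is not the report's code and does not use any ScaLAPACK interface: `PanelStore`, `ooc_cholesky_left_looking`, and `to_dense_lower` are hypothetical names, an in-memory dictionary stands in for the disk file, and a plain Python loop stands in for the in-core kernel. Each column panel is read, updated with every previously factored panel to its left, factored in memory, and written back exactly once.

```python
import math


class PanelStore:
    """Hypothetical stand-in for a disk-resident matrix, stored by column panels."""

    def __init__(self, a, width):
        self.n = len(a)
        self.starts = list(range(0, self.n, width))
        # Each panel holds full-height columns j0 .. j0+width-1 (list of columns).
        self.panels = {
            j0: [[a[i][j] for i in range(self.n)]
                 for j in range(j0, min(j0 + width, self.n))]
            for j0 in self.starts
        }
        self.reads = 0   # counts simulated panel reads
        self.writes = 0  # counts simulated panel writes

    def read(self, j0):
        self.reads += 1
        return [col[:] for col in self.panels[j0]]

    def write(self, j0, cols):
        self.writes += 1
        self.panels[j0] = [col[:] for col in cols]


def ooc_cholesky_left_looking(store):
    """Overwrite the stored SPD matrix with its lower Cholesky factor L."""
    n = store.n
    for j0 in store.starts:
        panel = store.read(j0)  # bring the current panel "in core"
        w = len(panel)
        # Left-looking step: apply every already-factored panel to this one.
        for p0 in store.starts:
            if p0 >= j0:
                break
            left = store.read(p0)
            for c in range(w):
                j = j0 + c
                for lcol in left:
                    ljk = lcol[j]
                    for i in range(j, n):
                        panel[c][i] -= lcol[i] * ljk
        # In-core factorization of the updated panel (assumed kernel stand-in).
        for c in range(w):
            j = j0 + c
            for k in range(c):
                ljk = panel[k][j]
                for i in range(j, n):
                    panel[c][i] -= panel[k][i] * ljk
            d = math.sqrt(panel[c][j])
            for i in range(j, n):
                panel[c][i] /= d
            for i in range(j):
                panel[c][i] = 0.0  # clear the strictly upper part
        store.write(j0, panel)  # each panel is written exactly once


def to_dense_lower(store):
    """Assemble the stored panels into a dense row-major matrix."""
    cols = [col for j0 in store.starts for col in store.panels[j0]]
    return [[cols[j][i] for j in range(store.n)] for i in range(store.n)]
```

In this sketch a matrix split into P panels incurs P panel writes and P(P+1)/2 panel reads, which illustrates why a left-looking ordering keeps write traffic low; the actual routines additionally distribute each panel across processes, which the sketch omits.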

Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: USDOE, Washington, DC (United States); USDOE Office of Energy Research, Washington, DC (United States); National Science Foundation, Washington, DC (United States); Defense Advanced Research Projects Agency, Arlington, VA (United States)

DOE Contract Number: AC05-96OR22464

OSTI ID: 296722

Report Number(s): ORNL/TM-13372; R&D Project: 4AC; ON: DE98054626; BR: 11A400301; CNN: Grant ASC-9005933; Contract DAAL03-91-C-0047; Agreement CCR-8809615; TRN: AHC29903%%120

Resource Relation: Other Information: PBD: Apr 1997

Country of Publication: United States

Language: English

Similar Records

Packed storage extension for ScaLAPACK

Technical Report · January 1, 1997 · OSTI ID:296722

D'Azevedo, E F; Dongarra, J J

Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs

Journal Article · July 1, 2016 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:296722

Kurzak, Jakub; Anzt, Hartwig; Gates, Mark; +1 more

Scalability issues affecting the design of a dense linear algebra library

Journal Article · September 1, 1994 · Journal of Parallel and Distributed Computing; (United States) · OSTI ID:296722

Dongarra, J J; Geijn, R.A. van de; Walker, D W

Related Subjects

99 MATHEMATICS, COMPUTERS, INFORMATION SCIENCE, MANAGEMENT, LAW, MISCELLANEOUS
FACTORIZATION
S CODES
MATRICES
ALGORITHMS
PERFORMANCE
SUPERCOMPUTERS
MEMORY MANAGEMENT
ARRAY PROCESSORS
