
Title: Automatic Blocking Of QR and LU Factorizations for Locality

Conference

QR and LU factorizations for dense matrices are important linear algebra computations that are widely used in scientific applications. To perform these computations efficiently on modern computers, the factorization algorithms must be blocked when operating on large matrices so that they exploit the deep cache hierarchy prevalent in today's memory systems. Because both QR (based on Householder transformations) and LU factorization algorithms contain complex loop structures, few compilers can fully automate the blocking of these algorithms. Although linear algebra libraries such as LAPACK provide manually blocked implementations of these algorithms, generating the blocked versions automatically offers further benefits, such as automatic adaptation to different blocking strategies. This paper demonstrates how to apply an aggressive loop transformation technique, dependence hoisting, to produce efficient blockings for both QR and LU with partial pivoting. We present different blocking strategies that can be generated by our optimizer and compare the performance of the auto-blocked versions with the manually tuned versions in LAPACK, using reference BLAS, ATLAS BLAS, and native BLAS specially tuned for the underlying machine architectures.
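As a rough illustration of what "blocking" means for LU with partial pivoting, the sketch below contrasts a plain elimination loop with a hand-written right-looking blocked variant. It is not the output of the paper's optimizer and does not implement dependence hoisting; the function names `lu_unblocked` and `lu_blocked` and the `block` parameter are hypothetical, and pure-Python lists stand in for real arrays.

```python
def lu_unblocked(a):
    """In-place LU with partial pivoting on a square list-of-lists matrix.

    On return, the strict lower triangle holds L (unit diagonal implied)
    and the upper triangle holds U. Returns perm, where row k of the
    factored matrix corresponds to row perm[k] of the input.
    """
    n = len(a)
    perm = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            a[k], a[p] = a[p], a[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return perm


def lu_blocked(a, block=2):
    """Same factorization, reorganized into panels of `block` columns.

    Illustrative assumption: a right-looking scheme, chosen here only
    because it is a common manual blocking strategy.
    """
    n = len(a)
    perm = list(range(n))
    for k0 in range(0, n, block):
        k1 = min(k0 + block, n)
        # Step 1: factor the tall panel (columns k0..k1-1), swapping whole rows.
        for k in range(k0, k1):
            p = max(range(k, n), key=lambda i: abs(a[i][k]))
            if a[p][k] == 0.0:
                raise ValueError("matrix is singular")
            if p != k:
                a[k], a[p] = a[p], a[k]
                perm[k], perm[p] = perm[p], perm[k]
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 2: finish the U rows of this block (unit lower triangular solve).
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 3: update the trailing submatrix; this matrix-matrix product
        # is where a tuned BLAS-3 routine would reuse cached data.
        for i in range(k1, n):
            for j in range(k1, n):
                for k in range(k0, k1):
                    a[i][j] -= a[i][k] * a[k][j]
    return perm
```

Both functions perform the same arithmetic; the blocked form defers updates to the trailing submatrix until a whole panel has been factored, which concentrates most of the work in step 3.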

Research Organization:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
W-7405-ENG-48
OSTI ID:
15013895
Report Number(s):
UCRL-CONF-203233; TRN: US200803%%1057
Resource Relation:
Conference: Presented at: The Second ACM SIGPLAN Workshop on Memory System Performance, Washington, DC, United States, Jun 08, 2004
Country of Publication:
United States
Language:
English
