Summary: LAPACK Working Note 111, UTK, http://www.netlib.org/lapack/lawns
Optimizing Matrix Multiply using PHiPAC: a Portable,
High-Performance, ANSI C Coding Methodology
Jeff Bilmes, Krste Asanovic, Jim Demmel, Dominic Lam, Chee-Whye Chin
August 8, 1996
Abstract
BLAS3 operations have great potential for aggressive optimization. Unfortunately, they
usually need to be hand-coded for a specific machine and compiler to achieve near-peak
performance. We have developed a methodology whereby near-peak performance on a wide range
of systems can be achieved automatically for such routines. First, by analyzing current
machines and C compilers, we've developed guidelines for writing Portable, High-Performance,
ANSI C (PHiPAC, pronounced "fee-pack"). Second, rather than code by hand, we produce
parameterized code generators. Third, we write search scripts that find the best parameters
for a given system. We report on a BLAS GEMM compatible multi-level cache-blocked matrix
multiply generator that produces code achieving performance in excess of 90% of peak on the
Sparcstation-20/61, IBM RS/6000-590, HP 712/80i, and 80% of peak on the SGI Indigo R4k.
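
The idea of a parameterized blocked multiply tuned by search can be illustrated with a minimal sketch in ANSI C. This is not the PHiPAC generator itself: it uses a single level of square blocking chosen at run time, whereas the abstract describes generated, multi-level blocked code. The names blocked_gemm and search_block_size, the row-major square-matrix layout, and the use of clock() for timing are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical sketch: C += A * B for row-major n x n matrices, with one
   level of square cache blocking. nb is the tunable block size. */
void blocked_gemm(size_t n, size_t nb,
                  const double *a, const double *b, double *c)
{
    size_t ii, kk, jj, i, k, j, i_end, k_end, j_end;
    double aik;

    assert(nb > 0);
    for (ii = 0; ii < n; ii += nb) {
        i_end = (ii + nb < n) ? ii + nb : n;
        for (kk = 0; kk < n; kk += nb) {
            k_end = (kk + nb < n) ? kk + nb : n;
            for (jj = 0; jj < n; jj += nb) {
                j_end = (jj + nb < n) ? jj + nb : n;
                /* Multiply one block of A by one block of B. */
                for (i = ii; i < i_end; ++i) {
                    for (k = kk; k < k_end; ++k) {
                        aik = a[i * n + k];
                        for (j = jj; j < j_end; ++j) {
                            c[i * n + j] += aik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}

/* Hypothetical search step: time each candidate block size once and return
   the fastest. c is zeroed before every trial, so on return it holds the
   product computed with the last candidate. */
size_t search_block_size(size_t n, const size_t *candidates, size_t count,
                         const double *a, const double *b, double *c)
{
    size_t t, x, best;
    double elapsed, best_time = -1.0;
    clock_t start;

    assert(count > 0);
    best = candidates[0];
    for (t = 0; t < count; ++t) {
        for (x = 0; x < n * n; ++x) {
            c[x] = 0.0;
        }
        start = clock();
        blocked_gemm(n, candidates[t], a, b, c);
        elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (best_time < 0.0 || elapsed < best_time) {
            best_time = elapsed;
            best = candidates[t];
        }
    }
    return best;
}
```

A caller would fill a and b, pass a short list of candidate block sizes (for example 16, 32, 64) to search_block_size, and then reuse the returned value in later calls to blocked_gemm. A single clock() measurement is noisy on small matrices, so a practical search would likely repeat each trial and use larger problem sizes.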

  

Source: Asanovic, Krste - Computer Science and Artificial Intelligence Laboratory & Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT)

 

Collections: Computer Technologies and Information Sciences