Portable high performance GEMM-based level 3 BLAS
Conference · OSTI ID:54425
- Univ. of Umea (Sweden)
- Cornell Univ., Ithaca, NY (United States)
The Level 3 Basic Linear Algebra Subprograms (BLAS) are designed to perform various matrix multiply and triangular system solving computations. Developing optimal Level 3 BLAS code is costly and time consuming because it requires assembly-level programming and thinking. However, it is possible to develop a portable, high-performance Level 3 BLAS that relies only on an optimized GEMM, the BLAS subprogram for the general matrix multiply-and-add operation. With suitable partitioning, all the other Level 3 BLAS subprograms can be defined in terms of GEMM and a negligible amount of Level 1 and 2 computations. Performance results of our portable GEMM-based library for double precision real data are presented for various target architectures.
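The partitioning idea in the abstract can be illustrated with a triangular solve. The following is a minimal sketch, not the authors' library code: it is written in plain Python for readability, and the names `gemm`, `trsm_lower`, and the block size `nb` are hypothetical. It solves L X = B for lower triangular L by block rows, so that the bulk of the work is a GEMM update and only the small diagonal blocks need a Level 2-style forward substitution.

```python
def gemm(alpha, A, B, beta, C):
    """C <- alpha*A*B + beta*C, updating the rows of C in place.

    Stand-in for an optimized GEMM kernel (assumption: dense row-major lists).
    """
    m, k, n = len(A), len(B), len(C[0])
    for i in range(m):
        for j in range(n):
            s = 0.0
            for p in range(k):
                s += A[i][p] * B[p][j]
            C[i][j] = alpha * s + beta * C[i][j]


def trsm_lower(L, B, nb=2):
    """Solve L X = B for X, with L lower triangular; B is overwritten by X.

    nb is a hypothetical block size; a tuned value would be machine dependent.
    """
    n = len(L)
    ncols = len(B[0])
    for s in range(0, n, nb):
        e = min(s + nb, n)
        if s > 0:
            # GEMM step: B[s:e, :] -= L[s:e, 0:s] * X[0:s, :]
            panel = [row[:s] for row in L[s:e]]
            # B[s:e] shares its row objects with B, so the update is in place.
            gemm(-1.0, panel, B[:s], 1.0, B[s:e])
        # Small diagonal block: forward substitution (Level 2-style work).
        for i in range(s, e):
            for j in range(ncols):
                acc = B[i][j]
                for p in range(s, i):
                    acc -= L[i][p] * B[p][j]
                B[i][j] = acc / L[i][i]
    return B


if __name__ == "__main__":
    L = [[2.0, 0.0, 0.0], [1.0, 3.0, 0.0], [4.0, 5.0, 6.0]]
    X_true = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    B = [[0.0, 0.0] for _ in range(3)]
    gemm(1.0, L, X_true, 0.0, B)   # build the right-hand side B = L * X_true
    print(trsm_lower(L, B, nb=2))  # should recover X_true up to rounding
```

As the matrix grows relative to `nb`, the share of arithmetic performed inside `gemm` grows with it, which is why a single tuned GEMM can carry most of the performance in such a scheme.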
- OSTI ID: 54425
- Report Number(s): DOE/ER/25151--1-Vol.1; CONF-930331--Vol.1; CNN: Grant NUTEK 89-02578P
- Country of Publication: United States
- Language: English
Similar Records
ChatBLAS: The First AI-Generated and Portable BLAS Library
Conference · Fri Nov 01 00:00:00 EDT 2024 · OSTI ID:3002546
PB-BLAS: A set of parallel block basic linear algebra subprograms
Conference · Fri Dec 30 23:00:00 EST 1994 · OSTI ID:78703
Threaded Multi-Core GEMM with MoA and Cache-Blocking: Preprint
Conference · Mon Feb 28 23:00:00 EST 2022 · OSTI ID:1848079