Improving the Performance of DGEMM with MoA and Cache-Blocking: Preprint

Thomas, Stephen; Mullin, Lenore; Swirydowicz, Katarzyna

Conference · Wed Feb 09 04:00:00 EST 2022

OSTI ID:1845269

The goal of this paper is to demonstrate performance enhancements of DGEMM, the high-performance dense matrix-matrix multiply kernel that vendors widely implement in their Basic Linear Algebra Subprograms (BLAS) libraries. The Mathematics of Arrays (MoA) paradigm due to Mullin (1988) yields contiguous memory accesses, in combination with Church-Rosser complete language constructs optimized for the target processor architecture [3]. Our performance studies demonstrate that the MoA implementation of DGEMM, combined with optimal cache-blocking strategies, achieves at least a 25% performance gain over the vendor-supplied Intel MKL and IBM ESSL libraries on Intel Xeon Skylake and IBM Power-9 processors, respectively. Results are presented for the NREL Eagle and ORNL Summit supercomputers.
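To make the two ideas named in the abstract concrete, the following is a minimal sketch of a cache-blocked matrix multiply with a stride-1 inner loop. It is not the authors' MoA implementation: the function names `dgemm_naive` and `dgemm_blocked`, the flat row-major storage, the `i-p-j` loop order, and the block size `bs` are all assumptions chosen for illustration. Python is used only to show the index arithmetic; the cache effects themselves matter in a compiled kernel.

```python
def dgemm_naive(m, n, k, a, b):
    """Reference C = A * B. A is m x k, B is k x n, both flat row-major lists."""
    c = [0.0] * (m * n)
    for i in range(m):
        for j in range(n):
            s = 0.0
            for p in range(k):
                # b is read with stride n here: consecutive p values
                # touch elements that are n entries apart in memory.
                s += a[i * k + p] * b[p * n + j]
            c[i * n + j] = s
    return c


def dgemm_blocked(m, n, k, a, b, bs=4):
    """Blocked C = A * B (illustrative block size bs, an assumption).

    The three outer loops walk over bs x bs tiles so that the working set
    of the inner loops stays small. The innermost loop runs over j, so both
    b and c are traversed contiguously (stride 1) in row-major storage.
    """
    c = [0.0] * (m * n)
    for i0 in range(0, m, bs):
        i1 = min(i0 + bs, m)
        for p0 in range(0, k, bs):
            p1 = min(p0 + bs, k)
            for j0 in range(0, n, bs):
                j1 = min(j0 + bs, n)
                # Multiply one tile of A by one tile of B and
                # accumulate into the matching tile of C.
                for i in range(i0, i1):
                    row_c = i * n
                    for p in range(p0, p1):
                        a_ip = a[i * k + p]
                        row_b = p * n
                        for j in range(j0, j1):
                            c[row_c + j] += a_ip * b[row_b + j]
    return c


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # 2 x 3
    b = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0]   # 3 x 2
    print(dgemm_blocked(2, 2, 3, a, b, bs=2))  # [58.0, 64.0, 139.0, 154.0]
```

Both functions compute the same product; the blocked version only reorders the work. In a tuned kernel the block size would be chosen per cache level and processor, which is the kind of tuning the abstract refers to as cache-blocking strategy.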

Research Organization: National Renewable Energy Laboratory (NREL), Golden, CO (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)

DOE Contract Number: AC36-08GO28308

OSTI ID: 1845269

Report Number(s): NREL/CP-2C00-80232; MainId:42435; UUID:6e404b9c-0e8d-4c93-a224-30161a0be7be; MainAdminID:63011

Country of Publication: United States

Language: English

Similar Records

Threaded Multi-Core GEMM with MoA and Cache-Blocking: Preprint

Conference · Mon Feb 28 23:00:00 EST 2022 · OSTI ID:1848079

Performance Analysis of Memory Transfers and GEMM Subroutines on NVIDIA Tesla GPU Cluster

Conference · Mon Aug 31 00:00:00 EDT 2009 · OSTI ID:965387

A Flexible-blocking Based Approach for Performance Tuning of Matrix Multiplication Routines for Large Matrices with Edge Cases

Conference · Fri Nov 30 23:00:00 EST 2018 · OSTI ID:1557472

Related Subjects

DGEMM
MATHEMATICS AND COMPUTING
MoA
cache-blocking
contiguous memory
mathematics of arrays
