
General Matrix-Matrix Multiplication Using SIMD Features of the PIII

Douglas Aberdeen Douglas.Aberdeen@anu.edu.au
Jonathan Baxter Jonathan.Baxter@anu.edu.au
Research School of Information Sciences and Engineering
Australian National University
Abstract. Generalised matrix-matrix multiplication forms the kernel of
many mathematical algorithms. A faster matrix-matrix multiply immediately
benefits these algorithms. In this paper we implement efficient
matrix multiplication for large matrices using the floating point Intel
SIMD (Single Instruction Multiple Data) architecture. A description of
the issues and our solution is presented, paying attention to all levels of
the memory hierarchy. Our results demonstrate an average performance
of 2.09 times faster than the leading public domain matrix-matrix multiply
routines.
1 Introduction
A range of applications such as artificial neural networks benefit from GEMM
(generalised matrix-matrix) multiply routines that run as fast as possible. The
challenge is to use the CPU's peak floating point performance when memory
access is fundamentally slow. The SSE (Streaming SIMD Extensions) instruc-


Source: Aberdeen, Douglas - National ICT Australia & Computer Sciences Laboratory, Australian National University


Collections: Computer Technologies and Information Sciences