Matrix multiplication operations with data preconditioning in a high performance computing architecture
Abstract
Mechanisms for performing matrix multiplication operations with data preconditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation into a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicate that element to each of a plurality of elements of a second target vector register. A multiply-add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products of the matrix multiplication operation.
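The sequence in the abstract (vector load, load and splat, multiply-add, accumulate) can be illustrated with a minimal scalar emulation. This is a sketch under stated assumptions, not the patented implementation: the register width `VLEN = 4`, the column-major storage layout, and all function names are hypothetical choices made for illustration, and plain Python lists stand in for hardware vector registers.

```python
VLEN = 4  # assumed number of elements per vector register (not given in the abstract)

def vector_load(mem, offset):
    """Load VLEN consecutive elements into an emulated vector register."""
    return mem[offset:offset + VLEN]

def load_and_splat(mem, offset):
    """Load one element and replicate it into every lane of a register."""
    return [mem[offset]] * VLEN

def multiply_add(a, b, acc):
    """Lane-wise multiply-add: returns acc[i] + a[i] * b[i] for each lane."""
    return [c + x * y for x, y, c in zip(a, b, acc)]

def matmul_column(a_colmajor, b_colmajor, k_dim, col):
    """Compute one VLEN-element column of C = A * B.

    Assumed layout: A is VLEN x k_dim and B is k_dim x n, both column-major.
    """
    acc = [0.0] * VLEN  # accumulator for the partial products
    for k in range(k_dim):
        a_vec = vector_load(a_colmajor, k * VLEN)               # column k of A
        b_splat = load_and_splat(b_colmajor, col * k_dim + k)   # B[k][col] in all lanes
        acc = multiply_add(a_vec, b_splat, acc)                 # accumulate partial product
    return acc

# A has columns [1, 2, 3, 4] and [5, 6, 7, 8]; B = [[1, 2], [3, 4]].
A = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
B = [1.0, 3.0, 2.0, 4.0]
first_column = matmul_column(A, B, k_dim=2, col=0)
second_column = matmul_column(A, B, k_dim=2, col=1)
```

Each loop iteration contributes one partial product (a column of A scaled by a single element of B), so the splat lets every lane of the multiply-add do useful work without a separate broadcast step.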
 Inventors:
 Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A
 Issue Date:
 Research Org.:
 International Business Machines Corp., Armonk, NY (United States)
 Sponsoring Org.:
 USDOE
 OSTI Identifier:
 1107797
 Patent Number(s):
 8577950
 Application Number:
 12/542,255
 Assignee:
 International Business Machines Corporation (Armonk, NY)
 Patent Classifications (CPCs):

 G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
 DOE Contract Number:
 B554331
 Resource Type:
 Patent
 Country of Publication:
 United States
 Language:
 English
 Subject:
 97 MATHEMATICS AND COMPUTING
Citation Formats
Eichenberger, Alexandre E, Gschwind, Michael K, and Gunnels, John A. Matrix multiplication operations with data preconditioning in a high performance computing architecture. United States: N. p., 2013. Web.
Eichenberger, Alexandre E, Gschwind, Michael K, & Gunnels, John A. Matrix multiplication operations with data preconditioning in a high performance computing architecture. United States.
Eichenberger, Alexandre E, Gschwind, Michael K, and Gunnels, John A. Tue . "Matrix multiplication operations with data preconditioning in a high performance computing architecture". United States. https://www.osti.gov/servlets/purl/1107797.
@article{osti_1107797,
title = {Matrix multiplication operations with data preconditioning in a high performance computing architecture},
author = {Eichenberger, Alexandre E and Gschwind, Michael K and Gunnels, John A},
abstractNote = {Mechanisms for performing matrix multiplication operations with data preconditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation into a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicate that element to each of a plurality of elements of a second target vector register. A multiply-add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products of the matrix multiplication operation.},
place = {United States},
year = {2013},
month = {11}
}
Works referenced in this record:
High performance software on Intel Pentium Pro processors or MicroOps to TeraFLOPS
conference, January 1997
 Greer, Bruce; Henry, Greg
 Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM)  Supercomputing '97, p. 113
Automatically Tuned Linear Algebra Software
conference, January 1998
 Whaley, R. C.; Dongarra, J. J.
 SC98 - High Performance Networking and Computing Conference, Proceedings of the IEEE/ACM SC98 Conference
Adaptive Strassen and ATLAS's DGEMM: a fast square-matrix multiply for modern high-performance systems
conference, January 2005
 D'Alberto, P.; Nicolau, A.
 Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05)