DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Matrix multiplication operations with data pre-conditioning in a high performance computing architecture

Abstract

Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation into a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicate that element to each of a plurality of elements of a second target vector register. A multiply-add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products of the matrix multiplication operation.
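
The sequence described in the abstract (vector load, load and splat, multiply-add, accumulate) can be illustrated with a small software emulation. This is only a sketch under stated assumptions, not the patented implementation: the register width `VLEN = 4`, the column-major storage of both matrices, and every function name below are hypothetical choices made for illustration, and plain Python lists stand in for vector registers.

```python
VLEN = 4  # assumed vector register width in elements (not given in the abstract)


def vector_load(mem, offset):
    """Emulate a vector load: copy VLEN consecutive elements into a 'register'."""
    return mem[offset:offset + VLEN]


def load_and_splat(mem, offset):
    """Emulate load and splat: read one element and replicate it to every slot."""
    return [mem[offset]] * VLEN


def multiply_add(va, vb, acc):
    """Emulate an element-wise multiply-add: acc[i] + va[i] * vb[i]."""
    return [c + a * b for a, b, c in zip(va, vb, acc)]


def matmul_splat(a_cols, b_cols, k, n):
    """Compute C = A * B for a VLEN x k matrix A and a k x n matrix B.

    Both inputs are flat lists in column-major order (an assumption of this
    sketch). The result is C as a flat column-major list of VLEN * n elements.
    """
    c_cols = []
    for j in range(n):
        acc = [0.0] * VLEN                           # accumulator for column j of C
        for p in range(k):
            va = vector_load(a_cols, p * VLEN)       # column p of A
            vb = load_and_splat(b_cols, j * k + p)   # B[p][j] in every slot
            acc = multiply_add(va, vb, acc)          # accumulate the partial product
        c_cols.extend(acc)
    return c_cols


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5, 6, 7, 8]   # 4 x 2 matrix, columns (1,2,3,4) and (5,6,7,8)
    b = [1, 2, 3, 4]               # 2 x 2 matrix, columns (1,2) and (3,4)
    print(matmul_splat(a, b, k=2, n=2))
```

In this layout each column of the result is a sum of columns of the first operand, each scaled by one element of the second operand, which is why replicating that single element across a register lets one multiply-add update a whole column segment at once.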

Inventors:
Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A
Issue Date:
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1107797
Patent Number(s):
8577950
Application Number:
12/542,255
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
DOE Contract Number:  
B554331
Resource Type:
Patent
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Eichenberger, Alexandre E, Gschwind, Michael K, and Gunnels, John A. Matrix multiplication operations with data pre-conditioning in a high performance computing architecture. United States: N. p., 2013. Web.
Eichenberger, Alexandre E, Gschwind, Michael K, & Gunnels, John A. Matrix multiplication operations with data pre-conditioning in a high performance computing architecture. United States.
Eichenberger, Alexandre E, Gschwind, Michael K, and Gunnels, John A. 2013. "Matrix multiplication operations with data pre-conditioning in a high performance computing architecture". United States. https://www.osti.gov/servlets/purl/1107797.
@article{osti_1107797,
title = {Matrix multiplication operations with data pre-conditioning in a high performance computing architecture},
author = {Eichenberger, Alexandre E and Gschwind, Michael K and Gunnels, John A},
abstractNote = {Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation into a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicate that element to each of a plurality of elements of a second target vector register. A multiply-add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products of the matrix multiplication operation.},
place = {United States},
year = {2013},
month = {11}
}
