OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: Matrix multiplication operations using pair-wise load and splat operations

Abstract

Mechanisms for performing a matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A pair-wise load and splat operation is performed to load a pair of scalar values of a second vector operand and replicate the pair of scalar values within a second target vector register. An operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored. This operation may be repeated for a second pair of scalar values of the second vector operand.
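
The steps in the abstract (vector load, pair-wise load and splat, element-wise multiply, accumulate, repeat for the next pair) can be illustrated with a small scalar emulation. This is only a sketch under stated assumptions, not the patented implementation: the register width of four elements, the 2x2 tile packing of the first operand, the column-major layout of the second operand, and every function name below are hypothetical choices made for illustration.

```python
VLEN = 4  # assumed vector register width in elements (not stated in the record)


def vector_load(mem, offset):
    """Emulate a vector load: copy VLEN contiguous elements into a 'register'."""
    return mem[offset:offset + VLEN]


def load_pair_and_splat(mem, offset):
    """Emulate a pair-wise load and splat: read two adjacent scalars and
    replicate the pair across the register, giving [s0, s1, s0, s1]."""
    s0, s1 = mem[offset], mem[offset + 1]
    return [s0, s1] * (VLEN // 2)


def multiply_add(acc, va, vb):
    """Element-wise multiply of two registers, accumulated into acc."""
    return [c + a * b for c, a, b in zip(acc, va, vb)]


def pack_a_tiles(A):
    """Hypothetical packing: store A as 2x2 tiles so that one vector load
    yields [A[i][k], A[i][k+1], A[i+1][k], A[i+1][k+1]]."""
    packed = []
    for i in range(0, len(A), 2):
        for k in range(0, len(A[0]), 2):
            packed += [A[i][k], A[i][k + 1], A[i + 1][k], A[i + 1][k + 1]]
    return packed


def matmul_pair_splat(A, B):
    """Compute C = A * B; the row count of A and the inner dimension must be even."""
    M, K, N = len(A), len(B), len(B[0])
    assert M % 2 == 0 and K % 2 == 0, "sketch assumes even M and K"
    a_mem = pack_a_tiles(A)
    # Column-major B, so B[k][j] and B[k+1][j] are adjacent in memory.
    b_mem = [B[k][j] for j in range(N) for k in range(K)]
    tiles_per_row = K // 2
    C = [[0] * N for _ in range(M)]
    for i in range(0, M, 2):
        for j in range(N):
            acc = [0] * VLEN  # accumulated partial products
            for t in range(tiles_per_row):  # one pair of B scalars per step
                va = vector_load(a_mem, ((i // 2) * tiles_per_row + t) * VLEN)
                vb = load_pair_and_splat(b_mem, j * K + 2 * t)
                acc = multiply_add(acc, va, vb)
            # Each half of the accumulator holds the terms of one result element.
            C[i][j] = acc[0] + acc[1]
            C[i + 1][j] = acc[2] + acc[3]
    return C
```

In this layout each step multiplies a 2x2 tile of the first operand by one replicated pair from the second operand, so the inner loop corresponds to repeating the operation for successive pairs of scalar values. The final pairwise sum of the accumulator halves is a consequence of the assumed tile layout, not something the record describes.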

Inventors:
Eichenberger, Alexandre E.; Gschwind, Michael K.; Gunnels, John A.; Salapura, Valentina
Publication Date:
2017 Mar 21
Research Org.:
International Business Machines Corporation, Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1347566
Patent Number(s):
9,600,281
Application Number:
12/834,464
Assignee:
International Business Machines Corporation
DOE Contract Number:
B554331
Resource Type:
Patent
Resource Relation:
Patent File Date: 2010 Jul 12
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Eichenberger, Alexandre E., Gschwind, Michael K., Gunnels, John A., and Salapura, Valentina. Matrix multiplication operations using pair-wise load and splat operations. United States: N. p., 2017. Web.
Eichenberger, Alexandre E., Gschwind, Michael K., Gunnels, John A., & Salapura, Valentina. Matrix multiplication operations using pair-wise load and splat operations. United States.
Eichenberger, Alexandre E., Gschwind, Michael K., Gunnels, John A., and Salapura, Valentina. 2017. "Matrix multiplication operations using pair-wise load and splat operations". United States. https://www.osti.gov/servlets/purl/1347566.
@article{osti_1347566,
title = {Matrix multiplication operations using pair-wise load and splat operations},
author = {Eichenberger, Alexandre E. and Gschwind, Michael K. and Gunnels, John A. and Salapura, Valentina},
abstractNote = {Mechanisms for performing a matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A pair-wise load and splat operation is performed to load a pair of scalar values of a second vector operand and replicate the pair of scalar values within a second target vector register. An operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored. This operation may be repeated for a second pair of scalar values of the second vector operand.},
place = {United States},
year = {2017},
month = {3}
}

  • Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicate the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
  • Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.
  • An apparatus is described for optically performing matrix-matrix multiplication using incoherent light comprising: means for providing a source of pulsed incoherent light; means disposed to intercept at least a portion of the pulsed light from the incoherent light source providing means for changing the optical properties of the pulsed light; means disposed in an aligned relationship with the changing means for integrating the portion of the pulsed light that the resolution cells of the first element and the second element permit passage thereto, the integrating means has a two-dimensional area architecture sized to equal the area sum of the resolution cells of one of the elements; and means coupled to the first element and the second element for actuating a simultaneous, mutually orthogonal displacement of the first and second matrix information in synchronization with the pulsing of the pulsed incoherent light source providing means.
  • Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.