# Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture

## Abstract

Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.
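The sequence the abstract describes (vector load, complex load and splat, cross multiply add, accumulate) can be illustrated with a small software emulation. This is a minimal sketch, not the patented hardware instructions: the function names (`vector_load`, `complex_load_splat`, `cross_multiply_add`, `complex_matvec`), the four-element register width, and the interleaved `[re0, im0, re1, im1]` layout are all assumptions made for illustration.

```python
# Hypothetical emulation: a "vector register" is a Python list of floats that
# holds complex values as interleaved (real, imaginary) pairs.

def vector_load(mem, offset, width=4):
    """Load `width` consecutive elements, e.g. [re0, im0, re1, im1]."""
    return list(mem[offset:offset + width])

def complex_load_splat(mem, offset, width=4):
    """Load one complex value (re, im) and replicate it across the register."""
    re, im = mem[offset], mem[offset + 1]
    return [re, im] * (width // 2)

def cross_multiply_add(acc, a, b):
    """Return acc + a * b, treating each (re, im) pair as one complex number."""
    out = list(acc)
    for i in range(0, len(a), 2):
        a_re, a_im = a[i], a[i + 1]
        b_re, b_im = b[i], b[i + 1]
        out[i] += a_re * b_re - a_im * b_im      # real part of the product
        out[i + 1] += a_re * b_im + a_im * b_re  # imaginary part of the product
    return out

def complex_matvec(a_cols, x, rows=2):
    """Accumulate y = A @ x, one partial product per column of A.

    `a_cols` holds the columns of A in interleaved form; `x` is interleaved too.
    """
    acc = [0.0] * (2 * rows)                            # result register
    for k, col in enumerate(a_cols):
        a_reg = vector_load(col, 0, 2 * rows)           # first operand
        x_reg = complex_load_splat(x, 2 * k, 2 * rows)  # second operand, splatted
        acc = cross_multiply_add(acc, a_reg, x_reg)     # accumulate partial product
    return acc

# A = [[1+2j, 3+4j], [5+6j, 7+8j]], x = [1+1j, 2-1j]
print(complex_matvec([[1, 2, 5, 6], [3, 4, 7, 8]], [1, 1, 2, -1]))
# prints [9.0, 8.0, 21.0, 20.0], i.e. y = [9+8j, 21+20j]
```

Splatting the second operand lets every complex element of the first register be multiplied by the same complex scalar in one step, so each column of the matrix contributes one partial product to the accumulator.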

- Inventors:
- Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A

- Issue Date:

- Research Org.:
- International Business Machines Corporation, Armonk, NY, USA

- Sponsoring Org.:
- USDOE

- OSTI Identifier:
- 1119675

- Patent Number(s):
- 8,650,240

- Application Number:
- 12/542,324

- Assignee:
- International Business Machines Corporation (Armonk, NY)

- DOE Contract Number:
- B554331

- Resource Type:
- Patent

- Country of Publication:
- United States

- Language:
- English

- Subject:
- 97 MATHEMATICS AND COMPUTING

### Citation Formats

```
Eichenberger, Alexandre E, Gschwind, Michael K, and Gunnels, John A. Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture. United States: N. p., 2014. Web.
```

```
Eichenberger, Alexandre E, Gschwind, Michael K, & Gunnels, John A. Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture. United States.
```

```
Eichenberger, Alexandre E, Gschwind, Michael K, and Gunnels, John A. Tue .
"Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture". United States. https://www.osti.gov/servlets/purl/1119675.
```

```
@article{osti_1119675,
  title = {Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture},
  author = {Eichenberger, Alexandre E and Gschwind, Michael K and Gunnels, John A},
  abstractNote = {Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.},
  doi = {},
  journal = {},
  number = {},
  volume = {},
  place = {United States},
  year = {2014},
  month = {2}
}
```

Works referenced in this record:

## Vector processor architecture and methods performed therein

patent-application, April 2004

- Demjanenko, Victor
- US Patent Application 10/467225; 20040073773

## Matrix multiplication in a vector processing system

patent-application, September 2005

- Sazegari, Ali
- US Patent Application 11/113035; 20050193050

## Transferring data from integer to vector registers

patent-application, March 2007

- Citron, Daniel; Zaks, Ayal
- US Patent Application 11/214348; 20070050598

## Programmable digital signal processor having a clustered SIMD microarchitecture including a complex short multiplier and an independent vector load unit

patent-application, August 2007

- Liu, Dake; Nilsson, Anders Henrik; Tell, Eric Johan
- US Patent Application 11/201841; 20070198815

## Matrix multiply with reduced bandwidth requirements

patent-application, November 2007

- Juffa, Norbert; Nickolls, John R.
- US Patent Application 11/430324; 20070271325

## System and Method for Compiling Scalar Code for a Single Instruction Multiple Data (SIMD) Execution Engine

patent-application, September 2008

- Gschwind, Michael K.
- US Patent Application 12/127857; 20080229066

## Optimized Corner Turns for Local Storage and Bandwidth Reduction

patent-application, November 2009

- Brokenshire, Daniel A.; Gunnels, John A.; Kistler, Michael D.
- US Patent Application 12/125996; 20090292758

## Reducing Bandwidth Requirements for Matrix Multiplication

patent-application, December 2009

- Brokenshire, Daniel A.; Gunnels, John A.; Kistler, Michael D.
- US Patent Application 12/129789; 20090300091

## Optimized Scalar Promotion with Load and Splat SIMD Instructions

patent-application, December 2009

- Eichenberger, Alexandre E.; Gschwind, Michael K.; Gunnels, John A.
- US Patent Application 12/134495; 20090307656

## Method and Apparatus for Vector Execution on a Scalar Machine

patent-application, December 2009

- Colavin, Osvaldo; Rizzo, Davide; Soni, Vineet
- US Patent Application 12/544250; 20090313458

## Complex Matrix Multiplication Operations with Data Pre-Conditioning in a High Performance Computing Architecture

patent-application, February 2011

- Eichenberger, Alexandre E.; Gschwind, Michael K.; Gunnels, John A.
- US Patent Application 12/542324; 20110040822

## Method and Structure of Using SIMD Vector Architectures to Implement Matrix Multiplication

patent-application, March 2011

- Eichenberger, Alexandre E.; Gschwind, Michael Karl; Gunnels, John A.
- US Patent Application 12/548129; 20110055517

## Performing A Multiply-Multiply-Accumulate Instruction

patent-application, July 2013

- Sprangle, Eric
- US Patent Application 13/783963; 20130179661

## Processor with Instructions Variable Data Distribution

patent-application, July 2013

- Hung, Ching-Yu; Inamori, Shinri; Sankaran, Jagadeesh
- US Patent Application 13/548933; 20130185544

## Adaptive Strassen and ATLAS's DGEMM: a fast square-matrix multiply for modern high-performance systems

conference, January 2005

- D'Alberto, P.; Nicolau, A.
- Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05)

## High performance software on Intel Pentium Pro processors or Micro-Ops to TeraFLOPS

conference, January 1997

- Greer, Bruce; Henry, Greg
- Proceedings of the 1997 ACM/IEEE conference on Supercomputing (CDROM) - Supercomputing '97, p. 1-13