Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture

Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A

Title: Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture

Abstract

Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.
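The data flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration only, not the patented instruction semantics. It assumes a hypothetical 4-element vector register holding two complex values interleaved as [re0, im0, re1, im1], a column-major operand A of shape 2 x n, and an n x 1 operand B. All function names (vector_load, complex_load_splat, cross_multiply_add, complex_matmul_2xn_by_nx1) are invented for this sketch.

```python
# Hypothetical model: a "vector register" is a list of 4 floats that holds
# two complex values interleaved as [re0, im0, re1, im1].

def vector_load(mem, offset):
    """Load 4 consecutive floats (two interleaved complex values)."""
    return list(mem[offset:offset + 4])


def complex_load_splat(mem, offset):
    """Load one complex value (re, im) and replicate it across the register."""
    re, im = mem[offset], mem[offset + 1]
    return [re, im, re, im]


def cross_multiply_add(a, b, acc):
    """Add the complex product a_k * b_k to acc for each complex slot k."""
    out = list(acc)
    for k in (0, 2):
        a_re, a_im = a[k], a[k + 1]
        b_re, b_im = b[k], b[k + 1]
        out[k] += a_re * b_re - a_im * b_im      # real part
        out[k + 1] += a_re * b_im + a_im * b_re  # imaginary part
    return out


def complex_matmul_2xn_by_nx1(a_mem, b_mem, n):
    """Accumulate the partial products of a (2 x n) * (n x 1) complex product.

    a_mem: column-major A, 4 floats per column (two interleaved complex values).
    b_mem: B, 2 floats per complex element.
    """
    result = [0.0, 0.0, 0.0, 0.0]  # result vector register
    for j in range(n):
        a_reg = vector_load(a_mem, 4 * j)         # first vector operand
        b_reg = complex_load_splat(b_mem, 2 * j)  # replicated second operand
        result = cross_multiply_add(a_reg, b_reg, result)
    return result


# A = [[1+2j, 3+4j], [5+6j, 7+8j]] stored column by column; B = [1-1j, 2+0.5j]
a_mem = [1.0, 2.0, 5.0, 6.0, 3.0, 4.0, 7.0, 8.0]
b_mem = [1.0, -1.0, 2.0, 0.5]
result = complex_matmul_2xn_by_nx1(a_mem, b_mem, 2)
```

Replicating the element of B lets a single register-wide operation apply the same complex factor to every complex value loaded from A, so each loop iteration contributes one partial product to the accumulator.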

Inventors: Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A

Issue Date: February 11, 2014

Research Org.: International Business Machines Corp., Armonk, NY (United States)

Sponsoring Org.: USDOE

OSTI Identifier: 1119675

Patent Number(s): 8650240

Application Number: 12/542,324

Assignee: International Business Machines Corporation (Armonk, NY)

Patent Classifications (CPCs): G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING

G06F17/16 - Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization}
G06F9/30014 - {with variable precision}
G06F9/30032 - {Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE}
G06F9/30036 - {Instructions to perform operations on packed data, e.g. vector operations}
G06F9/30043 - {LOAD or STORE instructions}
G06F9/30109 - {having multiple operands in a single register}


DOE Contract Number: B554331

Resource Type: Patent

Country of Publication: United States

Language: English

Subject: 97 MATHEMATICS AND COMPUTING

Citation Formats


Eichenberger, Alexandre E, Gschwind, Michael K, and Gunnels, John A. Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture. United States: N. p., 2014. Web.

Eichenberger, Alexandre E, Gschwind, Michael K, & Gunnels, John A. Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture. United States.

Eichenberger, Alexandre E, Gschwind, Michael K, and Gunnels, John A. 2014. "Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture". United States. https://www.osti.gov/servlets/purl/1119675.


                    
@article{osti_1119675,
  title        = {Complex matrix multiplication operations with data pre-conditioning in a high performance computing architecture},
  author       = {Eichenberger, Alexandre E and Gschwind, Michael K and Gunnels, John A},
  abstractNote = {Mechanisms for performing a complex matrix multiplication operation are provided. A vector load operation is performed to load a first vector operand of the complex matrix multiplication operation to a first target vector register. The first vector operand comprises a real and imaginary part of a first complex vector value. A complex load and splat operation is performed to load a second complex vector value of a second vector operand and replicate the second complex vector value within a second target vector register. The second complex vector value has a real and imaginary part. A cross multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the complex matrix multiplication operation. The partial product is accumulated with other partial products and a resulting accumulated partial product is stored in a result vector register.},
  place        = {United States},
  year         = {2014},
  month        = {feb}
}


Works referenced in this record:

Multiprocessor for hardware emulation
patent, August 1996

Beausoleil, William F.; Ng, Tak-kwong; Palmer, Harold R.
US Patent Document 5,551,013
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/5551013

Decoding guest instruction to directly access emulation routines that emulate the guest instructions
patent, November 1996

Davidian, Gary
US Patent Document 5,574,873
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/5574873

Method for emulating guest instructions on a host computer through dynamic recompilation of host instructions
patent, August 1998

Traut, Eric P.
US Patent Document 5,790,825
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/5790825

Processor that decodes a multi-cycle instruction into single-cycle micro-instructions and schedules execution of the micro-instructions
patent, July 1999

Nguyen, Le Trong; Park, Heonchul
US Patent Document 5,923,862
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/5923862

Preprocessing of stored target routines for emulating incompatible instructions on a target processor
patent, December 1999

Scalzi, Casper A.; Schwarz, Eric M.; Starke, William J.
US Patent Document 6,009,261
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/6009261

Explicit DST-based filter operating in the DCT domain
patent, September 2000

Kresch, Renato; Merhav, Neri
US Patent Document 6,125,212
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/6125212

Symmetrical multiprocessing bus and chipset used for coprocessor support allowing non-native code to run in a system
patent, October 2001

Gorishek, IV, Frank J.; Boswell, Jr., Charles Ray; Smith, David W.
US Patent Document 6,308,255
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/6308255

Dynamic optimizing object code translator for architecture emulation and dynamic optimizing object code translation method
patent, October 2002

Lethin, Richard A.; Bank, III, Joseph A.; Garrett, Charles D.
US Patent Document 6,463,582
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/6463582

Method and apparatus for vector register with scalar values
patent, March 2003

Choquette, Jack H.
US Patent Document 6,530,011
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/6530011

Method and apparatus for obtaining a scalar value directly from a vector register
patent, February 2005

Liao, Yu-Chung C.; Sandon, Peter A.; Cheng, Howard
US Patent Document 6,857,061
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/6857061

Apparatus for efficient LFSR calculation in a SIMD processor
patent, November 2007

Mimar, Tibet
US Patent Document 7,302,627
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/7302627

Vector co-processor for configurable and extensible processor architecture
patent, May 2008

Sanghavi, Himanshu A.; Killian, Earl A.; Kennedy, James Robert
US Patent Document 7,376,812
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/7376812

Vector processing system
patent, November 2008

Barlow, Stephen; Bailey, Neil; Ramsdale, Timothy
US Patent Document 7,457,941
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/7457941

Method and apparatus for vector execution on a scalar machine
patent, September 2009

Colavin, Osvaldo; Rizzo, Davide; Soni, Vineet
US Patent Document 7,594,102
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/7594102

Method and system for efficient matrix multiplication in a SIMD processor architecture
patent, January 2011

Mimar, Tibet
US Patent Document 7,873,812
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/7873812

System and software for performing matrix multiply extract operations
patent, April 2011

Hansen, Craig; Moussouris, John; Massalin, Alexia
US Patent Document 7,932,910
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/7932910

Systems, apparatus, and methods for performing digital pre-distortion with feedback signal adjustment
patent, November 2011

Norris, George B.; Staudinger, Joseph; Chen, Jau-Horng
US Patent Document 8,068,574
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/8068574

Vector processor architecture and methods performed therein
patent-application, April 2004

Demjanenko, Victor
US Patent Application 10/467225; 20040073773
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20040073773

Matrix multiplication in a vector processing system
patent-application, September 2005

Sazegari, Ali
US Patent Application 11/113035; 20050193050
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20050193050

Transferring data from integer to vector registers
patent-application, March 2007

Citron, Daniel; Zaks, Ayal
US Patent Application 11/214348; 20070050598
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20070050598

Programmable digital signal processor having a clustered SIMD microarchitecture including a complex short multiplier and an independent vector load unit
patent-application, August 2007

Liu, Dake; Nilsson, Anders Henrik; Tell, Eric Johan
US Patent Application 11/201841; 20070198815
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20070198815

Matrix multiply with reduced bandwidth requirements
patent-application, November 2007

Juffa, Norbert; Nickolls, John R.
US Patent Application 11/430324; 20070271325
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20070271325

System and Method for Compiling Scalar Code for a Single Instruction Multiple Data (SIMD) Execution Engine
patent-application, September 2008

Gschwind, Michael K.
US Patent Application 12/127857; 20080229066
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20080229066

Optimized Corner Turns for Local Storage and Bandwidth Reduction
patent-application, November 2009

Brokenshire, Daniel A.; Gunnels, John A.; Kistler, Michael D.
US Patent Application 12/125996; 20090292758
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20090292758

Reducing Bandwidth Requirements for Matrix Multiplication
patent-application, December 2009

Brokenshire, Daniel A.; Gunnels, John A.; Kistler, Michael D.
US Patent Application 12/129789; 20090300091
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20090300091

Optimized Scalar Promotion with Load and Splat SIMD Instructions
patent-application, December 2009

Eichenberger, Alexandre E.; Gschwind, Michael K.; Gunnels, John A.
US Patent Application 12/134495; 20090307656
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20090307656

Method and Apparatus for Vector Execution on a Scalar Machine
patent-application, December 2009

Colavin, Osvaldo; Rizzo, Davide; Soni, Vineet
US Patent Application 12/544250; 20090313458
URL: https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20090313458