Optimized scalar promotion with load and splat SIMD instructions
Abstract
Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.
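The two placement steps described in the abstract can be illustrated with a small sketch. It is not the patented implementation: the toy intermediate representation, the `Op` class, the function names, and the `_v` naming convention are all assumptions made for illustration. The first pass replaces a scalar load with a combined load-and-splat when every user of the loaded value is a SIMD operation. The second pass inserts a separate splat for any remaining scalar value that a SIMD operation reads. Deleting redundant splats, which the abstract also mentions, is left out.

```python
from dataclasses import dataclass


@dataclass
class Op:
    dest: str           # name of the value this operation defines
    kind: str           # e.g. "load", "add", "vadd", "load_splat", "splat"
    args: tuple         # names of the values (or addresses) it reads
    simd: bool = False  # True if the operation works on whole vectors


def users(code, name):
    """Return every operation that reads the value called `name`."""
    return [op for op in code if name in op.args]


def place_load_splats(code):
    """Pass 1 (hypothetical): turn a scalar load into a combined
    load-and-splat when all users of the loaded value are SIMD ops."""
    out = []
    for op in code:
        uses = users(code, op.dest)
        if op.kind == "load" and uses and all(u.simd for u in uses):
            out.append(Op(op.dest, "load_splat", op.args, simd=True))
        else:
            out.append(op)
    return out


def place_separate_splats(code):
    """Pass 2 (hypothetical): for each remaining scalar value read by a
    SIMD op, insert one explicit splat and redirect the SIMD users."""
    scalar_defs = {op.dest for op in code if not op.simd}
    out = []
    for op in code:
        if op.simd and op.kind != "splat":
            # SIMD users read the splatted copy, named "<scalar>_v" here.
            args = tuple(a + "_v" if a in scalar_defs else a for a in op.args)
            op = Op(op.dest, op.kind, args, simd=True)
        out.append(op)
        if not op.simd and any(u.simd for u in users(code, op.dest)):
            out.append(Op(op.dest + "_v", "splat", (op.dest,), simd=True))
    return out


# Original code: "a" feeds only SIMD code, "b" feeds only scalar code,
# and the scalar result "c" is later consumed by a SIMD operation.
original = [
    Op("a", "load", ("addr_a",)),
    Op("b", "load", ("addr_b",)),
    Op("c", "add", ("b", "b")),
    Op("v1", "vmul", ("vx", "a"), simd=True),
    Op("v2", "vadd", ("v1", "c"), simd=True),
]

first_modified = place_load_splats(original)
second_modified = place_separate_splats(first_modified)
for op in second_modified:
    print(op.dest, "=", op.kind, op.args)
```

In this sketch the load of `a` becomes a single `load_splat`, the load of `b` stays scalar because its only user is scalar, and one separate `splat` is inserted after the scalar `add` that produces `c`.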
- Inventors:
- Eichenberger, Alexander E; Gschwind, Michael K; Gunnels, John A
- Issue Date:
- 2013-10-29
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1107622
- Patent Number(s):
- 8572586
- Application Number:
- 13/555,435
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B554331
- Resource Type:
- Patent
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Eichenberger, Alexander E, Gschwind, Michael K, and Gunnels, John A. Optimized scalar promotion with load and splat SIMD instructions. United States: N. p., 2013. Web.
Eichenberger, Alexander E, Gschwind, Michael K, & Gunnels, John A. Optimized scalar promotion with load and splat SIMD instructions. United States.
Eichenberger, Alexander E, Gschwind, Michael K, and Gunnels, John A. 2013. "Optimized scalar promotion with load and splat SIMD instructions". United States. https://www.osti.gov/servlets/purl/1107622.
@article{osti_1107622,
title = {Optimized scalar promotion with load and splat SIMD instructions},
author = {Eichenberger, Alexander E and Gschwind, Michael K and Gunnels, John A},
abstractNote = {Mechanisms for optimizing scalar code executed on a single instruction multiple data (SIMD) engine are provided. Placement of vector operation-splat operations may be determined based on an identification of scalar and SIMD operations in an original code representation. The original code representation may be modified to insert the vector operation-splat operations based on the determined placement of vector operation-splat operations to generate a first modified code representation. Placement of separate splat operations may be determined based on identification of scalar and SIMD operations in the first modified code representation. The first modified code representation may be modified to insert or delete separate splat operations based on the determined placement of the separate splat operations to generate a second modified code representation. SIMD code may be output based on the second modified code representation for execution by the SIMD engine.},
place = {United States},
year = {2013},
month = {10}
}
Works referenced in this record:
Optimized Scalar Promotion with Load and Splat SIMD Instructions
patent-application, December 2009
- Eichenberger, Alexandre E.; Gschwind, Michael K.; Gunnels, John A.
- US Patent Application 12/134495; 20090307656
Method for vectorizing and executing on an SIMD machine outer loops in the presence of recurrent inner loops
patent, December 1987
- Scarborough, Randolph G.
- US Patent Document 4,710,872
Methods and systems for developing data flow programs
patent, June 2006
- Lewis, Brad; Boucher, Michael; Horton, Noah
- US Patent Document 7,065,634
Two dimensional addressing of a matrix-vector register array
patent, June 2008
- Sandon, Peter A.; West, R. Michael P.
- US Patent Document 7,386,703
Parallel processor system for processing natural concurrencies and method therefor
patent, June 1991
- Morrison, Gordon E.; Brooks, Christopher Bancroft; Gluck, Frederick G.
- US Patent Document 5,021,945
Vectorization in a SIMdD DSP architecture
patent, December 2007
- Ben-David, Shay; Naishlos, Dorit; Shvadron, Uzi
- US Patent Document 7,313,788
Dynamic generation of multimedia code for image processing
patent, January 2007
- Sigmund, Ulrich
- US Patent Document 7,168,069
Method and apparatus for providing hardware assistance for data access coverage on dynamically allocated data
patent, November 2007
- Dimpsey, Robert Tod; Levine, Frank Eliot; Urquhart, Robert John
- US Patent Document 7,296,130
Autonomic method and apparatus for counting branch instructions to generate branch statistics meant to improve branch predictions
patent, November 2007
- DeWitt, Jr., Jimmie Earl; Levine, Frank Eliot; Richardson, Christopher Michael
- US Patent Document 7,293,164
Method and apparatus for counting instruction execution and data accesses for specific types of instructions
patent, August 2007
- DeWitt, Jr., Jimmie Earl; Levine, Frank Eliot; Richardson, Christopher Michael
- US Patent Document 7,257,657
Autonomic method and apparatus for local program code reorganization using branch count per instruction hardware
patent, October 2007
- DeWitt, Jr., Jimmie Earl; Levine, Frank Eliot; Richardson, Christopher Michael
- US Patent Document 7,290,255
Method and apparatus for providing hardware assistance for code coverage
patent, November 2007
- Dimpsey, Robert Tod; Levine, Frank Eliot; Urquhart, Robert John
- US Patent Document 7,299,319
Loop optimization with mapping code on an architecture
patent, August 2004
- Danckaert, Koen; Catthoor, Francky
- US Patent Document 6,772,415
