Index Sets and Vectorization
Vectorization is data parallelism (SIMD, SIMT, etc.) - extension of ISA enabling the same instruction to be performed on multiple data items simultaeously. Many/most CPUs support vectorization in some form. Vectorization is difficult to enable, but can yield large efficiency gains. Extra programmer effort is required because: (1) not all algorithms can be vectorized (regular algorithm structure and fine-grain parallelism must be used); (2) most CPUs have data alignment restrictions for load/store operations (obey or risk incorrect code); (3) special directives are often needed to enable vectorization; and (4) vector instructions are architecture-specific. Vectorization is the best way to optimize for power and performance due to reduced clock cycles. When data is organized properly, a vector load instruction (i.e. movaps) can replace 'normal' load instructions (i.e. movsd). Vector operations can potentially have a smaller footprint in the instruction cache when fewer instructions need to be executed. Hybrid index sets insulate users from architecture specific details. We have applied hybrid index sets to achieve optimal vectorization. We can extend this concept to handle other programming models.
- Research Organization:
- Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- W-7405-ENG-48
- OSTI ID:
- 1046799
- Report Number(s):
- LLNL-CONF-543971; TRN: US201215%%539
- Resource Relation:
- Conference: Presented at: Emerging Technologies in HPC Application Development, Livermore, CA, United States, Mar 19 - Mar 21, 2012
- Country of Publication:
- United States
- Language:
- English
Similar Records
MULTI-CORE AND OPTICAL PROCESSOR RELATED APPLICATIONS RESEARCH AT OAK RIDGE NATIONAL LABORATORY
Quantum Monte Carlo Endstation for Petascale Computing