U.S. Department of Energy
Office of Scientific and Technical Information

Introducing control flow into vectorized code.

Conference

Single-instruction, multiple-data (SIMD) functional units are ubiquitous in modern microprocessors, and using them effectively is essential to achieving the highest possible performance. Automatically generating SIMD instructions in the presence of control flow is challenging, however, for two reasons: SIMD code is hard to generate when control flow is arbitrarily complex, and SIMD code that executes the instructions on all control paths may run slower than the scalar original, which can bypass a large portion of the code. One promising technique introduced recently inserts branches-on-superword-condition-codes (BOSCCs) to bypass vector instructions. In this paper, we describe two techniques that improve on the previous approach. First, BOSCCs are generated in a nested fashion, so that even BOSCCs themselves can be bypassed by other BOSCCs. Second, we generate all vec_any_* instructions to bypass even some predicate-defining instructions. We implemented these techniques in a vectorizing compiler. On 14 kernels, the compiler achieves distinct speedups, including 1.99X over the previous technique, which generates single-level BOSCCs and vec_any_ne only.
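
The bypass idea in the abstract can be illustrated with a small emulation. The sketch below is not the paper's compiler output: it models a 4-lane superword with plain Python lists, and the kernel, the width `W`, and the helper names `any_ne`, `select`, `scalar_kernel`, and `boscc_kernel` are all hypothetical, chosen only for illustration. It assumes a loop body with one nested conditional. The outer branch plays the role of a BOSCC built on a vec_any_ne-style test applied directly to the operands, so the predicate-defining compare sits inside the guarded region; the inner branch plays the role of a nested BOSCC.

```python
W = 4  # assumed superword width (lanes per vector); illustrative only


def any_ne(x, y):
    """Emulate a vec_any_ne-style test: true if any lane of x differs from y."""
    return any(p != q for p, q in zip(x, y))


def select(mask, if_true, if_false):
    """Per-lane select: take if_true where the mask is set, else if_false."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


def scalar_kernel(a, b):
    """Scalar original: a nested conditional evaluated element by element."""
    out = list(b)
    for i in range(len(a)):
        if a[i] != 0:
            out[i] += a[i]
            if out[i] < 0:
                out[i] = 0
    return out


def boscc_kernel(a, b):
    """If-converted superword version with an outer and a nested bypass branch."""
    assert len(a) == len(b) and len(a) % W == 0
    out = list(b)
    stats = {"outer_skipped": 0, "inner_skipped": 0}
    zeros = [0] * W
    for i in range(0, len(a), W):
        va, vb = a[i:i + W], out[i:i + W]
        # Outer bypass: tested on the operands themselves, so the compare
        # that defines the predicate m1 is skipped too when no lane is active.
        if not any_ne(va, zeros):
            stats["outer_skipped"] += 1
            continue
        m1 = [x != 0 for x in va]                  # predicate-defining compare
        vsum = [x + y for x, y in zip(va, vb)]     # executed for all lanes
        vb = select(m1, vsum, vb)                  # keep old value where m1 is false
        m2 = [m and y < 0 for m, y in zip(m1, vb)]
        # Nested bypass: skip the inner guarded code when no lane needs it.
        if any(m2):
            vb = select(m2, zeros, vb)
        else:
            stats["inner_skipped"] += 1
        out[i:i + W] = vb
    return out, stats
```

Calling `boscc_kernel(a, b)` returns the same values as `scalar_kernel(a, b)` together with counts of how many superwords took each bypass. Whether a bypass pays off on real hardware depends on how often all lanes of a predicate are false, which this emulation does not measure.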

Research Organization:
Argonne National Laboratory (ANL)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
AC02-06CH11357
OSTI ID:
971147
Report Number(s):
ANL/MCS/CP-59118
Country of Publication:
United States
Language:
ENGLISH
