Early Experiences Writing Performance Portable OpenMP 4 Codes
Conference · OSTI ID:1324101
- ORNL
In this paper, we evaluate the recently available directives in OpenMP 4 to parallelize a computational kernel using both the traditional shared memory approach and the newer accelerator targeting capabilities. In addition, we explore various transformations that attempt to increase application performance portability, and examine the expressiveness and performance implications of using these approaches. For example, we want to understand whether the target map directives in OpenMP 4 improve data locality when mapped to a shared memory system, as opposed to the first-touch policy used in traditional OpenMP. To that end, we use recent Cray and Intel compilers to measure the performance variations of a simple application kernel when executed on the OLCF's Titan supercomputer with NVIDIA GPUs and the Beacon system with Intel Xeon Phi accelerators attached. To better understand these trade-offs, we compare our results from traditional OpenMP shared memory implementations to the newer accelerator programming model when it is used to target both the CPU and an attached heterogeneous device. We believe the results and lessons learned presented in this paper will be useful to the larger user community by providing guidelines that can assist programmers in developing performance portable code.
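To make the two programming styles named in the abstract concrete, the following is a minimal sketch in C. It is not the kernel studied in the paper: the AXPY-style loop and the function names `axpy_shared` and `axpy_target` are assumptions chosen purely for illustration. The first function uses the traditional shared memory directive; the second uses the OpenMP 4 `target` construct with explicit `map` clauses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel (not from the paper): y[i] = a * x[i] + y[i]. */

/* Traditional shared memory OpenMP: threads on the host share x and y.
 * On NUMA systems, page placement follows whichever thread first touches
 * the data (the first-touch policy). */
void axpy_shared(int n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* OpenMP 4 accelerator model: the loop is offloaded to the default device.
 * The map clauses state the data movement explicitly: x is copied to the
 * device, y is copied to the device and back. If no device is available,
 * or OpenMP is disabled, the loop simply runs on the host. */
void axpy_target(int n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result; they differ only in where the loop runs and in how data placement is expressed. Building with an OpenMP flag such as `-fopenmp` enables the directives, and without it the pragmas are ignored and the loops run serially.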
- Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Organization: USDOE Office of Science (SC)
- DOE Contract Number: AC05-00OR22725
- OSTI ID: 1324101
- Country of Publication: United States
- Language: English
Similar Records
Towards Achieving Performance Portability Using Directives for Accelerators
Conference · Tue Nov 01 00:00:00 EDT 2016 · OSTI ID:1567436
Evaluating Performance Portability of Accelerator Programming Models using SPEC ACCEL 1.2 Benchmarks
Conference · Sun Jul 01 00:00:00 EDT 2018 · OSTI ID:1468172
Performance-Portable GPU Acceleration of the EFIT Tokamak Plasma Equilibrium Reconstruction Code
Conference · Sat Nov 11 23:00:00 EST 2023 · Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis · OSTI ID:2477210