Early Experiences Writing Performance Portable OpenMP 4 Codes

Joubert, Wayne; Hernandez, Oscar R

Conference · Fri Jan 01 04:00:00 EST 2016

OSTI ID: 1324101

Joubert, Wayne [1]; Hernandez, Oscar R [1]

[1] ORNL

In this paper, we evaluate the recently available directives in OpenMP 4 for parallelizing a computational kernel using both the traditional shared memory approach and the newer accelerator targeting capabilities. In addition, we explore various transformations that attempt to increase application performance portability, and we examine the expressiveness and performance implications of these approaches. For example, we want to understand whether the target map directives in OpenMP 4 improve data locality when mapped to a shared memory system, compared with the first-touch policy used in traditional OpenMP. To that end, we use recent Cray and Intel compilers to measure the performance variations of a simple application kernel when executed on the OLCF's Titan supercomputer with NVIDIA GPUs and on the Beacon system with attached Intel Xeon Phi accelerators. To better understand these trade-offs, we compare our results from traditional OpenMP shared memory implementations with those from the newer accelerator programming model when it is used to target both the CPU and an attached heterogeneous device. We believe the results and lessons learned presented in this paper will be useful to the larger user community by providing guidelines that can assist programmers in developing performance portable code.
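To make the two programming styles named in the abstract concrete, the following is a minimal sketch in C. The record does not say which kernel the paper uses, so the AXPY-style loop and the function names `axpy_host`, `axpy_target`, and `variants_agree` are purely illustrative assumptions. The first function uses the traditional shared memory directive; the second uses the OpenMP 4 `target` construct with `map` clauses to describe data movement.

```c
#include <assert.h>

/* Hypothetical kernel, not taken from the paper: y[i] = a * x[i] + y[i]. */

/* Traditional shared memory OpenMP: host threads operate directly on the
 * host arrays; page placement follows whichever thread touched them first. */
void axpy_host(int n, double a, const double *x, double *y)
{
    assert(n >= 0);
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* OpenMP 4 accelerator model: the map clauses state which array sections
 * are copied to the device and which are copied back. If no device is
 * available, the region falls back to running on the host. */
void axpy_target(int n, double a, const double *x, double *y)
{
    assert(n >= 0);
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Returns 1 if both variants give identical results on a small input. */
int variants_agree(void)
{
    enum { N = 8 };
    double x[N], y_host[N], y_target[N];
    for (int i = 0; i < N; ++i) {
        x[i] = (double)i;
        y_host[i] = 1.0;
        y_target[i] = 1.0;
    }
    axpy_host(N, 2.0, x, y_host);
    axpy_target(N, 2.0, x, y_target);
    for (int i = 0; i < N; ++i)
        if (y_host[i] != y_target[i])
            return 0;
    return 1;
}
```

Both functions compute the same result; they differ only in how parallelism and data placement are expressed. When compiled without OpenMP support, the pragmas are ignored and the loops run serially, so the sketch shows the syntax only and is no guide to the performance behavior the paper measures.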

OSTI does not have a digital full-text copy available.

Research Organization: Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)

Sponsoring Organization: USDOE Office of Science (SC)

DOE Contract Number: AC05-00OR22725

OSTI ID: 1324101

Country of Publication: United States

Language: English

Similar Records

Towards Achieving Performance Portability Using Directives for Accelerators

Conference · Tue Nov 01 00:00:00 EDT 2016 · OSTI ID:1567436

Evaluating Performance Portability of Accelerator Programming Models using SPEC ACCEL 1.2 Benchmarks

Conference · Sun Jul 01 00:00:00 EDT 2018 · OSTI ID:1468172

Performance-Portable GPU Acceleration of the EFIT Tokamak Plasma Equilibrium Reconstruction Code

Conference · Sat Nov 11 23:00:00 EST 2023 · Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis · OSTI ID:2477210
