Early Experiences Writing Performance Portable OpenMP 4 Codes

Joubert, Wayne; Hernandez, Oscar R


Conference · January 1, 2016

OSTI ID:1324101

Joubert, Wayne [1]; Hernandez, Oscar R [1]

[1] ORNL

In this paper, we evaluate the recently available directives in OpenMP 4 to parallelize a computational kernel using both the traditional shared-memory approach and the newer accelerator-targeting capabilities. In addition, we explore various transformations that attempt to increase application performance portability, and examine the expressiveness and performance implications of these approaches. For example, we want to understand whether the target map directives in OpenMP 4 improve data locality when mapped to a shared-memory system, as opposed to the first-touch policy used in traditional OpenMP. To that end, we use recent Cray and Intel compilers to measure the performance variations of a simple application kernel when executed on the OLCF's Titan supercomputer with NVIDIA GPUs and on the Beacon system with attached Intel Xeon Phi accelerators. To better understand these trade-offs, we compare our results from traditional OpenMP shared-memory implementations to the newer accelerator programming model when it is used to target both the CPU and an attached heterogeneous device. We believe the results and lessons learned presented in this paper will be useful to the larger user community by providing guidelines that can assist programmers in developing performance-portable code.
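As a minimal sketch of the two styles the abstract contrasts, the C functions below run the same loop once with a traditional shared-memory `parallel for` and once with the OpenMP 4 `target` construct and explicit `map` clauses. The AXPY kernel and the names `axpy_host` and `axpy_target` are assumptions chosen for illustration; this record does not identify the kernel the authors used.

```c
/* Hypothetical AXPY kernel (y = a*x + y), used only to illustrate the
 * two OpenMP styles; it is not the kernel from the paper. */

/* Traditional shared-memory version: host threads split the iterations
 * and operate directly on x and y in shared memory. */
void axpy_host(int n, double a, const double *x, double *y)
{
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* OpenMP 4 accelerator version: the map clauses state which array
 * sections are copied to the device (x) and which are copied both
 * ways (y). Without an available device the region runs on the host. */
void axpy_target(int n, double a, const double *x, double *y)
{
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

If the file is built without OpenMP support, the pragmas are ignored and both functions run serially with the same result. Whether the `target` region executes on a GPU, on a Xeon Phi, or falls back to the host depends on the compiler and its offload configuration.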

OSTI does not have a digital full text copy available.

Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)

Sponsoring Organization: USDOE Office of Science (SC)

DOE Contract Number: AC05-00OR22725

OSTI ID: 1324101

Resource Relation: Conference: Cray User Group Conference 2016, London, United Kingdom, May 8-12, 2016

Country of Publication: United States

Language: English

Similar Records

Towards Achieving Performance Portability Using Directives for Accelerators

Conference · November 1, 2016 · OSTI ID:1324101

Lopez, M. Graham; Larrea, Veronica Vergara; Joubert, Wayne; +4 more

GPU acceleration of a petascale application for turbulent mixing at high Schmidt number using OpenMP 4.5

Journal Article · July 1, 2018 · Computer Physics Communications · OSTI ID:1324101

Clay, M. P.; Buaria, D.; Yeung, P. K.; +1 more

Investigation of Portable Event-Based Monte Carlo Transport Using the NVIDIA Thrust Library

Journal Article · June 15, 2016 · Transactions of the American Nuclear Society · OSTI ID:1324101

Bleile, Ryan C.; Brantley, Patrick S.; Dawson, Shawn A.; +2 more
