Heterogeneous computing with OpenMP and Hydra
- University of Illinois at Urbana-Champaign, Champaign, Illinois, USA
Summary: High-performance computing relies on accelerators (such as GPGPUs) to achieve fast execution of scientific applications. Traditionally, these accelerators have been programmed with specialized languages such as CUDA or OpenCL. In recent years, OpenMP has emerged as a promising alternative for supporting accelerators, offering advantages such as a single code base for the host and different accelerator types, and a simple way to extend accelerator support to existing application codes. Using this support efficiently requires solving several challenges related to performance, work partitioning, and concurrent execution on multiple device types. In this article, we discuss our experiences with using OpenMP for accelerators and present performance guidelines. We also introduce a library, Hydra, that addresses several of the challenges of using OpenMP for such devices. We apply Hydra to a scientific application, PlasCom2, that has not previously been able to use accelerators. Experiments on three architectures show that Hydra yields performance gains of up to 10× compared with CPU-only execution. Concurrent execution on the host and GPU resulted in additional gains of up to 20% compared with running on the GPU only.
- Sponsoring Organization: USDOE
- Grant/Contract Number: NA0002374
- OSTI ID: 1603688
- Journal Information: Concurrency and Computation: Practice and Experience, Vol. 32, Issue 20; ISSN 1532-0626
- Publisher: Wiley Blackwell (John Wiley & Sons)
- Country of Publication: United Kingdom
- Language: English