OSTI.GOV — U.S. Department of Energy
Office of Scientific and Technical Information

Title: Manage OpenMP GPU Data Environment Under Unified Address Space

Abstract

OpenMP has supported the offload of computations to accelerators such as GPUs since version 4.0. A crucial aspect of OpenMP offloading is managing the accelerator data environment. Currently, this has to be explicitly programmed by users, which is non-trivial and often results in suboptimal performance. The unified memory feature available in recent GPU architectures introduces another option, implicit management. However, our experiments show that it incurs several performance issues, especially under GPU memory oversubscription. In this paper, we propose a compiler and runtime collaborative approach to manage OpenMP GPU data under unified memory. In our framework, the compiler performs data reuse analysis to assist runtime data management. The runtime combines static and dynamic information to make optimized data management decisions. We have implemented the proposed technique in the LLVM framework. Our evaluation shows that the method achieves significant performance improvement for OpenMP GPU offloading.

Authors:
Li, Lingda
Publication Date:
2018
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Brookhaven National Lab. (BNL), Upton, NY (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (SC-21)
OSTI Identifier:
1484438
Report Number(s):
BNL-209639-2018-COPA
Journal ID: ISSN 0302-9743
DOE Contract Number:  
SC0012704
Resource Type:
Conference
Resource Relation:
Journal Volume: 11128; Conference: International Workshop on OpenMP 2018, Barcelona, Spain, 9/26/2018 - 9/28/2018
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Data management; Unified memory; OpenMP offloading; Compiler; Runtime; LLVM

Citation Formats

Li, Lingda. Manage OpenMP GPU Data Environment Under Unified Address Space. United States: N. p., 2018. Web. doi:10.1007/978-3-319-98521-3_5.
Li, Lingda. Manage OpenMP GPU Data Environment Under Unified Address Space. United States. https://doi.org/10.1007/978-3-319-98521-3_5
Li, Lingda. 2018. "Manage OpenMP GPU Data Environment Under Unified Address Space". United States. https://doi.org/10.1007/978-3-319-98521-3_5. https://www.osti.gov/servlets/purl/1484438.
@article{osti_1484438,
title = {Manage OpenMP GPU Data Environment Under Unified Address Space},
author = {Li, Lingda},
abstractNote = {OpenMP has supported the offload of computations to accelerators such as GPUs since version 4.0. A crucial aspect of OpenMP offloading is managing the accelerator data environment. Currently, this has to be explicitly programmed by users, which is non-trivial and often results in suboptimal performance. The unified memory feature available in recent GPU architectures introduces another option, implicit management. However, our experiments show that it incurs several performance issues, especially under GPU memory oversubscription. In this paper, we propose a compiler and runtime collaborative approach to manage OpenMP GPU data under unified memory. In our framework, the compiler performs data reuse analysis to assist runtime data management. The runtime combines static and dynamic information to make optimized data management decisions. We have implemented the proposed technique in the LLVM framework. Our evaluation shows that the method achieves significant performance improvement for OpenMP GPU offloading.},
doi = {10.1007/978-3-319-98521-3_5},
url = {https://www.osti.gov/biblio/1484438},
issn = {0302-9743},
volume = {11128},
place = {United States},
year = {2018},
month = {9}
}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.


Works referenced in this record:

Page Placement Strategies for GPUs within Heterogeneous Memory Systems
conference, January 2015

  • Agarwal, Neha; Nellans, David; Stephenson, Mark
  • Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '15
  • https://doi.org/10.1145/2694344.2694381

Offloading Support for OpenMP in Clang and LLVM
conference, November 2016


Rodinia: A benchmark suite for heterogeneous computing
conference, October 2009


Directive-Based Partitioning and Pipelining for Graphics Processing Units
conference, May 2017


Automatic CPU-GPU communication management and optimization
conference, January 2011

  • Jablin, Thomas B.; Prabhu, Prakash; Jablin, James A.
  • Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation - PLDI '11
  • https://doi.org/10.1145/1993498.1993516

High performance cache replacement using re-reference interval prediction (RRIP)
conference, January 2010


LLVM: A compilation framework for lifelong program analysis & transformation
conference, January 2004


Optimal bypass monitor for high performance last-level caches
conference, January 2012


Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading
conference, January 2017


Fast and efficient automatic memory management for GPUs using compiler-assisted runtime coherence scheme
conference, January 2012

  • Pai, Sreepathi; Govindarajan, R.; Thazhuthaveetil, Matthew J.
  • Proceedings of the 21st international conference on Parallel architectures and compilation techniques - PACT '12
  • https://doi.org/10.1145/2370816.2370824

Adaptive insertion policies for high performance caching
conference, January 2007


OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems
journal, May 2010


Optimizing bandwidth and power of graphics memory with hybrid memory technologies and adaptive data migration
conference, January 2012