Automatic Halo Management for the Uintah GPU-Heterogeneous Asynchronous Many-Task Runtime

Journal Article · International Journal of Parallel Programming

The Uintah computational framework is used for the parallel solution of partial differential equations on adaptive mesh refinement grids using modern supercomputers. Uintah is structured with an application layer and a separate runtime system. It is based on a distributed directed acyclic graph of computational tasks, with a task scheduler that efficiently schedules and executes these tasks on both CPU cores and on-node accelerators. The runtime system identifies task dependencies, creates a task graph prior to the execution of these tasks, automatically generates MPI message tags, and automatically performs halo transfers for simulation variables. Automating halo transfers in a heterogeneous environment poses significant challenges in two cases: when tasks compute within a few milliseconds, because runtime overhead then affects wall-clock execution time, and when simulation variables require large halos spanning most or all of the computational domain, because task dependencies then become expensive to process. These challenges are magnified at production scale, when application developers require each compute node to perform thousands of different halo transfers among thousands of simulation variables. The principal contributions of this work are to (1) identify and address inefficiencies that arise when mapping tasks onto the GPU in the presence of automated halo transfers, (2) implement new schemes to reduce runtime system overhead, (3) minimize application developer involvement with the runtime, and (4) show overhead reduction results from these improvements.
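
For readers unfamiliar with the term, a halo transfer copies the boundary cells of one grid patch into the ghost (halo) cells of its neighbours, so that each patch can compute its stencil without reaching into remote memory. The sketch below is a minimal one-dimensional illustration in plain Python, assuming equally sized patches ordered left to right on a single node. The names `make_patch` and `exchange_halos` are hypothetical and are not part of Uintah, which performs these transfers automatically across MPI ranks and GPUs.

```python
# Hypothetical illustration only; none of these names come from Uintah.

def make_patch(interior, halo, fill=0.0):
    """Lay a patch out as [left ghosts | interior | right ghosts]."""
    return [fill] * halo + list(interior) + [fill] * halo


def exchange_halos(patches, halo):
    """Fill each patch's ghost cells from its neighbours' interior cells.

    Patches are ordered left to right. Ghost cells on the outer domain
    boundary have no neighbour and are left unchanged.
    """
    for patch in patches:
        # Each interior must be at least as wide as the halo it supplies.
        assert len(patch) - 2 * halo >= halo, "interior narrower than halo"

    for i, patch in enumerate(patches):
        if i > 0:
            left = patches[i - 1]
            # Rightmost `halo` interior cells of the left neighbour.
            patch[:halo] = left[len(left) - 2 * halo:len(left) - halo]
        if i < len(patches) - 1:
            right = patches[i + 1]
            # Leftmost `halo` interior cells of the right neighbour.
            patch[len(patch) - halo:] = right[halo:2 * halo]
    return patches
```

Only ghost cells are written and only interior cells are read, so the order in which patches are visited does not matter. The cost of each copy grows with the halo width, which hints at why the large halos mentioned in the abstract are expensive to manage.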

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF); Sandia National Laboratories (SNL), Albuquerque, NM, and Livermore, CA (United States)
Sponsoring Organization:
USDOE Office of Science; USDOE National Nuclear Security Administration (NNSA)
DOE Contract Number:
NA0002375; AC05-00OR22725
OSTI ID:
1567537
Journal Information:
International Journal of Parallel Programming, Vol. 47, Issue 5-6; ISSN 0885-7458
Publisher:
Springer
Country of Publication:
United States
Language:
English
