U.S. Department of Energy
Office of Scientific and Technical Information

On a Simplified Approach to Achieve Parallel Performance and Portability Across CPU and GPU Architectures

Journal Article · Information
DOI: https://doi.org/10.3390/info15110673 · OSTI ID: 2475228
 [1];  [1];  [1];  [1];  [2];  [1];  [3];  [1];  [1];  [1];  [1];  [4];  [5];  [6];  [7]
  1. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
  2. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); Mississippi State University, Mississippi State, MS (United States)
  3. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); University of Minnesota, Minneapolis, MN (United States)
  4. University of New Hampshire, Durham, NH (United States)
  5. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); Texas A & M University, College Station, TX (United States)
  6. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); University of Colorado, Boulder, CO (United States)
  7. Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); AMD corporation, Santa Clara, CA (United States)
This paper presents software advances that make it easy to exploit multi-core CPU and CPU+GPU computer architectures to accelerate diverse types of high-performance computing (HPC) applications using a single code implementation. The paper describes and demonstrates the performance of the open-source C++ matrix and array (MATAR) library, which uniquely offers: (1) a straightforward syntax for programming productivity, (2) usable data structures for data-oriented programming (DOP) for performance, and (3) a simple interface to the open-source C++ Kokkos library for portability and memory management across CPUs and GPUs. Portability across architectures with a single code implementation is achieved by automatically switching between diverse fine-grained parallelism backends (e.g., CUDA, HIP, OpenMP, pthreads) at compile time. The MATAR library solves many longstanding challenges associated with easily writing software that can run in parallel on any computer architecture. This work benefits projects seeking to write new C++ codes, and it also addresses the challenge of quickly making existing Fortran codes performant and portable on modern computer architectures with minimal syntactical changes from Fortran to C++. We demonstrate the feasibility of readily writing new C++ codes and modernizing existing codes with MATAR to be performant, parallel, and portable across diverse computer architectures.
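The two ideas in the abstract, compile-time backend selection and Fortran-like array syntax in C++, can be illustrated with a minimal self-contained sketch. This is not the MATAR or Kokkos API: every name below (`sketch_backend`, `sketch_for_all`, `SketchFMatrix`, `sketch_fill`) is hypothetical, and the only backends assumed are OpenMP (when the compiler defines `_OPENMP`) and a serial fallback. The real libraries cover GPU backends and memory management that this sketch does not attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical backend label, fixed when the file is compiled.
// Building with OpenMP enabled defines _OPENMP and selects the threaded path.
#if defined(_OPENMP)
constexpr const char* sketch_backend = "OpenMP";
#else
constexpr const char* sketch_backend = "Serial";
#endif

// Hypothetical loop wrapper: the call site is identical for every backend,
// and the preprocessor picks the implementation at compile time.
template <typename Body>
void sketch_for_all(std::size_t n, Body body) {
#if defined(_OPENMP)
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(n); ++i) {
        body(static_cast<std::size_t>(i));
    }
#else
    for (std::size_t i = 0; i < n; ++i) {
        body(i);
    }
#endif
}

// Hypothetical Fortran-style 2D container:
// column-major storage and 1-based indices, accessed as a(i, j).
class SketchFMatrix {
public:
    SketchFMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) {
        assert(i >= 1 && i <= rows_ && j >= 1 && j <= cols_);
        return data_[(i - 1) + (j - 1) * rows_];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    const double* raw() const { return data_.data(); }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};

// Fill a(i, j) = 10*i + j. The outer loop over columns goes through the
// wrapper; each iteration writes one distinct column, so threads never
// touch the same element.
SketchFMatrix sketch_fill(std::size_t rows, std::size_t cols) {
    SketchFMatrix a(rows, cols);
    sketch_for_all(cols, [&](std::size_t jj) {
        const std::size_t j = jj + 1;  // shift to 1-based column index
        for (std::size_t i = 1; i <= rows; ++i) {
            a(i, j) = 10.0 * static_cast<double>(i) + static_cast<double>(j);
        }
    });
    return a;
}
```

Calling `sketch_fill(3, 2)` produces the same matrix whether or not OpenMP is enabled. The inner loop runs over the row index because column-major storage makes elements of one column contiguous in memory, which is the access pattern a translated Fortran loop nest would typically use.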
Research Organization:
Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Sponsoring Organization:
USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
Grant/Contract Number:
89233218CNA000001
OSTI ID:
2475228
Alternate ID(s):
OSTI ID: 2558047
Report Number(s):
LA-UR--22-20105
Journal Information:
Journal Name: Information; Journal Volume: 15; Journal Issue: 11; ISSN 2078-2489
Publisher:
MDPI
Country of Publication:
United States
Language:
English
