PETSc/TAO developments for GPU-based early exascale systems

Journal Article · The International Journal of High Performance Computing Applications

The Portable Extensible Toolkit for Scientific Computation (PETSc) library provides scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization via the Toolkit for Advanced Optimization (TAO). PETSc is used in dozens of scientific fields and is an important building block for many simulation codes. During the U.S. Department of Energy’s Exascale Computing Project, the PETSc team has made substantial efforts to enable efficient utilization of the massive fine-grain parallelism present within exascale compute nodes and to enable performance portability across exascale architectures. We recap some of the challenges that designers of numerical libraries face in such an endeavor, and then discuss the many developments we have made, which include the addition of new GPU backends, features supporting efficient on-device matrix assembly, better support for asynchronicity and GPU kernel concurrency, and new communication infrastructure. Finally, we evaluate the performance of these developments on some pre-exascale systems as well as the early exascale systems Frontier and Aurora, using compute kernel, communication layer, solver, and mini-application benchmark studies, and close with a few observations drawn from our experiences on the tension between portable performance and other goals of numerical libraries.

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF); USDOE National Nuclear Security Administration (NNSA); National Science Foundation (NSF)
Grant/Contract Number:
AC02-06CH11357; AC05-00OR22725
OSTI ID:
2997026
Alternate ID(s):
OSTI ID: 2507273
OSTI ID: 2566101
Journal Information:
Journal Name: The International Journal of High Performance Computing Applications; Journal Volume: 39; Journal Issue: 2; ISSN 1094-3420; ISSN 1741-2846
Publisher:
SAGE Publications
Country of Publication:
United States
Language:
English
