U.S. Department of Energy
Office of Scientific and Technical Information

Evaluating asynchronous Schwarz solvers on GPUs

Journal Article · International Journal of High Performance Computing Applications
 [1];  [1];  [2]
  1. Karlsruhe Institute of Technology, Karlsruhe, Germany
  2. Karlsruhe Institute of Technology, Karlsruhe, Germany, University of Tennessee, Knoxville, USA

With the arrival of the exascale computing era, most leadership supercomputers are heterogeneous and massively parallel. Even a single node can contain multiple co-processors such as GPUs alongside multiple CPU cores. For example, each node of ORNL’s Summit has six NVIDIA Tesla V100 GPUs and 42 IBM Power9 cores. Synchronizing across the compute resources of multiple nodes can be prohibitively expensive. Hence, it is necessary to develop and study asynchronous algorithms that avoid the synchronization cost of bulk-synchronous computing. In this study, we examine the asynchronous version of the abstract Restricted Additive Schwarz method as a solver. We do not synchronize explicitly; instead, we allow the communication between the sub-domains to be completely asynchronous, thereby removing the bulk-synchronous nature of the algorithm.

We accomplish this by using the one-sided Remote Memory Access (RMA) functions of the MPI standard. We study the benefits of using such an asynchronous solver over its synchronous counterpart. We also study how the communication patterns, which are governed by the partitioning and the overlap between the sub-domains, affect the global solver. Finally, we show that this concept can yield attractive performance benefits over the synchronous counterpart even for a well-balanced problem.
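To make the idea in the abstract concrete, the following is a minimal serial sketch, not the authors' implementation. It simulates an asynchronous Restricted Additive Schwarz iteration on a 1D Poisson matrix tridiag(-1, 2, -1) with a right-hand side of ones. All names (`async_ras_1d`, `local_solve`, `p_put`) and the test problem are assumptions made for illustration. No MPI or GPU code is involved: the one-sided "put" is imitated by writing into a neighbour's halo buffer with probability `p_put`, so sub-domains often solve with stale interface data.

```python
import random

def thomas(diag, off, rhs):
    """Solve a tridiagonal system with constant diagonal and off-diagonal."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = off / diag, rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def residual_norm(x, b):
    """Max-norm of b - A x for A = tridiag(-1, 2, -1)."""
    n = len(x)
    worst = 0.0
    for j in range(n):
        left = x[j - 1] if j > 0 else 0.0
        right = x[j + 1] if j < n - 1 else 0.0
        worst = max(worst, abs(b[j] - (2.0 * x[j] - left - right)))
    return worst

def local_solve(b, elo, ehi, left_halo, right_halo):
    """Solve on the overlapped sub-domain [elo, ehi) using the halo values
    currently stored locally (these may be out of date)."""
    rhs = b[elo:ehi]          # slicing copies the list
    rhs[0] += left_halo       # coupling to the unknown just left of elo
    rhs[-1] += right_halo     # coupling to the unknown just right of ehi - 1
    return thomas(2.0, -1.0, rhs)

def async_ras_1d(n=64, parts=4, overlap=2, steps=20000, p_put=0.5, seed=0):
    size = n // parts
    assert 0 <= overlap < size
    rng = random.Random(seed)
    b = [1.0] * n
    x = [0.0] * n
    owned = [(i * size, (i + 1) * size if i < parts - 1 else n)
             for i in range(parts)]
    ext = [(max(lo - overlap, 0), min(hi + overlap, n)) for lo, hi in owned]
    # halo[i] = [left, right]: sub-domain i's private copy of interface values
    halo = [[0.0, 0.0] for _ in range(parts)]
    for _ in range(steps):
        i = rng.randrange(parts)          # no global ordering or barrier
        (lo, hi), (elo, ehi) = owned[i], ext[i]
        u = local_solve(b, elo, ehi, halo[i][0], halo[i][1])
        x[lo:hi] = u[lo - elo:hi - elo]   # "restricted": keep owned part only
        # Imitated one-sided put: may or may not reach the neighbour this step.
        if i + 1 < parts and rng.random() < p_put:
            halo[i + 1][0] = x[ext[i + 1][0] - 1]
        if i > 0 and rng.random() < p_put:
            halo[i - 1][1] = x[ext[i - 1][1]]
    return x, b

if __name__ == "__main__":
    x, b = async_ras_1d()
    print("final residual (max norm):", residual_norm(x, b))
```

In this sketch, changing `parts` and `overlap` alters which interface values are exchanged, and lowering `p_put` makes the halo data staler; both are simple stand-ins for the partitioning, overlap, and communication effects the abstract refers to.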

Sponsoring Organization:
USDOE
Grant/Contract Number:
AC05-00OR22725
OSTI ID:
1778413
Journal Information:
International Journal of High Performance Computing Applications, Vol. 35, Issue 3; ISSN 1094-3420
Publisher:
SAGE Publications
Country of Publication:
United States
Language:
English
