Preparing sparse solvers for exascale computing

Journal Article · Philosophical Transactions of the Royal Society. A, Mathematical, Physical and Engineering Sciences
 [1];  [2];  [3];  [4];  [2];  [4];  [5];  [5];  [2];  [6];  [5];  [2];  [3]
  1. Univ. of Tennessee, Knoxville, TN (United States)
  2. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
  3. Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
  4. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
  5. Argonne National Lab. (ANL), Argonne, IL (United States)
  6. Vienna University of Technology, Wien (Austria)

Sparse solvers provide essential functionality for a wide variety of scientific applications. Highly parallel sparse solvers are needed for continuing advances in high-fidelity, multi-physics and multi-scale simulations, especially as we target exascale platforms. This paper describes the challenges, strategies and progress of the US Department of Energy Exascale Computing Project towards providing sparse solvers for exascale computing platforms. We address the demands of systems with thousands of high-performance node devices, where exposing concurrency, hiding latency and creating alternative algorithms become essential. The efforts described here are works in progress, highlighting current successes and upcoming challenges.

Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
AC04-94AL85000
OSTI ID:
1601440
Alternate ID(s):
OSTI ID: 1604740
OSTI ID: 1607441
OSTI ID: 1770021
Report Number(s):
SAND--2019-10821J; 679361
Journal Information:
Philosophical Transactions of the Royal Society. A, Mathematical, Physical and Engineering Sciences, Vol. 378, Issue 2166; ISSN 1364-503X
Publisher:
The Royal Society Publishing
Country of Publication:
United States
Language:
English

Cited By (1)

Toward Performance-Portable PETSc for GPU-based Exascale Systems preprint January 2020