A communication-avoiding 3D sparse triangular solver

Conference
We present a novel distributed-memory algorithm that improves the strong scalability of solving a sparse triangular system. This operation appears in the solve phase of direct methods for general sparse linear systems, Ax = b. Our 3D sparse triangular solver combines several techniques, including a 3D MPI process grid, elimination tree parallelism, and data replication, which together reduce per-process communication. We present analytical models of the communication cost of our algorithm and show that the 3D solver can asymptotically reduce the per-process communication volume by a factor of $O(n^{1/4})$ for problems arising from finite element discretizations of 2D "planar" PDEs and $O(n^{1/6})$ for those arising from 3D "non-planar" PDEs. We implement our algorithm for use in SuperLU_DIST3D, using a hybrid MPI+OpenMP programming model. When run on 12k cores of a Cray XC30, our 3D triangular solve algorithm outperforms the current state-of-the-art 2D algorithm by 7.2x for planar and 2.7x for non-planar sparse matrices.
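For orientation, the operation being accelerated can be sketched on a single process. The following is a minimal sequential illustration of a sparse lower-triangular solve (forward substitution) on a matrix in compressed sparse row form. It is not the 3D algorithm described above and does not use the SuperLU_DIST API; the function name `sparse_lower_solve` and its arguments are hypothetical and chosen only for this example.

```python
def sparse_lower_solve(indptr, indices, data, b):
    """Solve L x = b by forward substitution.

    L is lower triangular and stored in CSR form (indptr, indices, data),
    with every diagonal entry stored explicitly. Hypothetical helper for
    illustration only.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        acc = b[i]
        diag = None
        # Walk the stored entries of row i.
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j < i:
                # Subtract the contribution of an already computed unknown.
                acc -= data[k] * x[j]
            elif j == i:
                diag = data[k]
            else:
                raise ValueError("entry above the diagonal in row %d" % i)
        if diag is None or diag == 0:
            raise ZeroDivisionError("missing or zero diagonal in row %d" % i)
        x[i] = acc / diag
    return x


# L = [[2, 0, 0],
#      [1, 3, 0],
#      [0, 4, 5]]
indptr = [0, 1, 3, 5]
indices = [0, 0, 1, 1, 2]
data = [2.0, 1.0, 3.0, 4.0, 5.0]
b = [2.0, 7.0, 23.0]

x = sparse_lower_solve(indptr, indices, data, b)
print(x)
```

Each unknown x[i] depends on earlier unknowns x[j] wherever row i has a nonzero in column j. In a distributed-memory setting, those dependencies are what generate the per-process communication that the abstract refers to.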
Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1558528
Country of Publication:
United States
Language:
English
