OSTI.GOV: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Highly scalable distributed-memory sparse triangular solution algorithms.

Conference
 [1];  [1];  [1];  [1]
  1. Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Scalable Solvers Group

This paper presents a highly efficient distributed-memory parallel sparse triangular solver. In sparse linear solvers, the triangular solution phase typically follows the factorization phase, and it has become increasingly computationally expensive for direct solvers with many right-hand sides (RHSs) or preconditioned iterative solvers. However, the low arithmetic intensity and sequential nature of the triangular solve algorithm pose performance challenges for its large-scale distributed-memory parallelization. In this work, we propose several strategies to enhance the scalability of an algorithm with a 2D block-cyclic process layout. First, an asynchronous binary-tree-based communication scheme, implemented via non-blocking MPI functions, is used to broadcast partial solutions and reduce partial updates among a subset of processes for each block column and block row of the triangular matrix, respectively. This scheme reduces message latency, improves communication load balance, and significantly accelerates asynchronous execution of the triangular solve. In addition, efficient BLAS operations and threading implementations are exploited to accelerate local computations and further reduce process idle time. The proposed strategies are implemented in SuperLU_DIST, and numerical experiments show up to 4.4x improvement with one right-hand side and up to 6.1x improvement with 50 right-hand sides on 4096 processes, compared to the current release. This is the first time that a sparse triangular solver has demonstrated strong scaling on more than 4000 cores.
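The two structural ideas in the abstract, a 2D block-cyclic process layout and a binary-tree broadcast among the processes that share a block column, can be illustrated with a small sketch. This is not the SuperLU_DIST implementation (which is written in C with non-blocking MPI). The function names `owner`, `binary_tree_children`, and `tree_depth`, the row-major rank numbering, and the heap-style tree ordering are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def owner(i: int, j: int, pr: int, pc: int) -> Tuple[int, int]:
    """Grid coordinates of the process owning block (i, j) on a pr x pc
    process grid under a 2D block-cyclic layout."""
    return (i % pr, j % pc)


def binary_tree_children(ranks: List[int], root: int) -> Dict[int, List[int]]:
    """Arrange `ranks` into a binary tree rooted at `root` (heap-style
    ordering, an illustrative choice) and return each rank's children.
    In a tree broadcast every rank forwards to at most two others."""
    order = [root] + [r for r in ranks if r != root]
    children: Dict[int, List[int]] = {r: [] for r in order}
    for k, r in enumerate(order):
        for c in (2 * k + 1, 2 * k + 2):
            if c < len(order):
                children[r].append(order[c])
    return children


def tree_depth(n: int) -> int:
    """Number of hops from the root to the deepest node of a binary tree
    with n nodes filled level by level."""
    depth = 0
    while (1 << (depth + 1)) - 1 < n:
        depth += 1
    return depth


# Hypothetical 4 x 4 process grid, ranks numbered row-major (assumption).
pr, pc = 4, 4
j = 5                                              # block column to broadcast along
col_ranks = [r * pc + (j % pc) for r in range(pr)]  # processes holding column j
root_row, root_col = owner(j, j, pr, pc)            # owner of the diagonal block
root_rank = root_row * pc + root_col
schedule = binary_tree_children(col_ranks, root_rank)
print(schedule)
```

With a flat broadcast, the owner of the diagonal block would send one message to every other process in its grid column. With the tree, it sends at most two, and the remaining forwarding is spread across the other processes. For example, a grid column of 64 processes gives `tree_depth(64) == 6` hops instead of 63 sends from a single root. The reduction of partial updates along a block row can use the same tree traversed from the leaves toward the root.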

Research Organization:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC02-05CH11231
OSTI ID:
1602817
Resource Relation:
Conference: SIAM Workshop on Combinatorial Scientific Computing, June 6-8, 2018, Bergen, Norway
Country of Publication:
United States
Language:
English

Similar Records

A communication-avoiding 3D algorithm for sparse LU factorization on heterogeneous systems
Journal Article · August 19, 2019 · Journal of Parallel and Distributed Computing · OSTI ID:1602817

An asynchronous parallel linear equation solution technique
Conference · December 31, 1995 · OSTI ID:1602817

A communication-avoiding 3D sparse triangular solver
Conference · June 1, 2019 · OSTI ID:1602817
