U.S. Department of Energy
Office of Scientific and Technical Information

Fast synchronization-free algorithms for parallel sparse triangular solves with multiple right-hand sides

Journal Article · Concurrency and Computation: Practice and Experience
DOI: https://doi.org/10.1002/cpe.4244 · OSTI ID: 1557091
 [1];  [2];  [3];  [3];  [1]
  1. University of Copenhagen
  2. BATTELLE (PACIFIC NW LAB)
  3. STFC Rutherford Appleton Laboratory, UK

The sparse triangular solve kernels, SpTRSV and SpTRSM, are important building blocks for a number of numerical linear algebra routines. Parallelizing SpTRSV and SpTRSM on today's many-core platforms, such as GPUs, is not an easy task, since computing a component of the solution may depend on previously computed components, which enforces a degree of sequential processing. As a consequence, most existing work introduces a preprocessing stage that partitions the components into a group of level-sets or colour-sets, so that components within a set are independent and can be processed simultaneously during the subsequent solution stage. However, this class of methods requires a long preprocessing time as well as significant runtime synchronization overheads between the sets. To address this, we
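
The level-set preprocessing that the abstract attributes to existing work can be illustrated with a minimal serial sketch. It is not the synchronization-free method of the article, and it is not taken from the article's code. It assumes a lower-triangular matrix in CSR form with a nonzero diagonal; the function names `level_sets` and `sptrsv_level_sets` and the arrays `row_ptr`, `col_idx` and `vals` are hypothetical names chosen for this example.

```python
def level_sets(row_ptr, col_idx):
    """Group the rows of a lower-triangular CSR matrix into level sets.

    Row i is placed one level above the deepest row j < i it depends on,
    so rows that share a level have no dependencies on each other.
    """
    n = len(row_ptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j < i:  # x[i] needs x[j] to be computed first
                level[i] = max(level[i], level[j] + 1)
    sets = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        sets[lv].append(i)
    return sets


def sptrsv_level_sets(row_ptr, col_idx, vals, b):
    """Solve L x = b by sweeping the level sets in order (serial sketch)."""
    x = [0.0] * len(b)
    for rows in level_sets(row_ptr, col_idx):
        # Rows in one set are independent. A parallel version would process
        # them together and synchronize before moving to the next set.
        for i in rows:
            acc, diag = b[i], None
            for k in range(row_ptr[i], row_ptr[i + 1]):
                j = col_idx[k]
                if j == i:
                    diag = vals[k]
                else:
                    acc -= vals[k] * x[j]
            x[i] = acc / diag
    return x


# Example: L = [[2,0,0,0], [0,1,0,0], [1,3,4,0], [0,0,2,5]]
row_ptr = [0, 1, 2, 5, 7]
col_idx = [0, 1, 0, 1, 2, 2, 3]
vals = [2.0, 1.0, 1.0, 3.0, 4.0, 2.0, 5.0]
b = [2.0, 2.0, 19.0, 26.0]

print(level_sets(row_ptr, col_idx))               # [[0, 1], [2], [3]]
print(sptrsv_level_sets(row_ptr, col_idx, vals, b))  # [1.0, 2.0, 3.0, 4.0]
```

In this example rows 0 and 1 can be solved at the same time, while rows 2 and 3 each wait for the previous set. The pass that builds the sets and the barrier between consecutive sets correspond to the preprocessing time and synchronization overheads mentioned above.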

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1557091
Report Number(s):
PNNL-SA-130501
Journal Information:
Concurrency and Computation: Practice and Experience, Vol. 29, Issue 21
Country of Publication:
United States
Language:
English
