Title: A Flexible CUDA LU-based Solver for Small, Batched Linear Systems

Book

This chapter presents the implementation of a batched CUDA solver based on LU factorization for small linear systems. The solver can be used in applications such as reactive flow transport models, which apply the Newton-Raphson technique to linearize and iteratively solve the sets of nonlinear equations that represent the reactions at tens of thousands to millions of physical locations. The implementation exploits somewhat counterintuitive GPGPU programming techniques: it assigns the solution of each matrix (representing one system) to a single CUDA thread, does not use shared memory, and employs dynamic memory allocation on the GPU. These techniques enable our implementation to simultaneously solve sets of systems with over 100 equations and to employ LU decomposition with complete pivoting, providing the higher numerical accuracy required by certain applications. Other currently available batched linear solvers are limited in system size and support only partial pivoting, although they may be faster under certain conditions. We discuss the code of our implementation and compare it with the other implementations, examining the tradeoffs in performance and flexibility. This work will enable developers who need batched linear solvers to choose the implementation best suited to the features and requirements of their applications, and even to implement dynamic switching approaches that select the best implementation depending on the input data.
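To make the per-system work concrete, the following is a minimal CPU-side C++ sketch of LU elimination with complete pivoting for one small dense system, plus a loop over a batch. It is not the chapter's CUDA code: the function names `solve_complete_pivoting` and `solve_batch`, the row-major layout, and the singularity tolerance `tol` are all assumptions made for illustration. In a one-thread-per-system GPU design, each iteration of the batch loop would presumably correspond to the work of a single thread.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: solve one dense n x n system A x = b in place using
// LU elimination with complete pivoting. A is row-major and is overwritten
// with the factors; b is overwritten with the solution x. Returns false if
// the largest remaining pivot candidate is <= tol (assumed tolerance).
bool solve_complete_pivoting(std::vector<double>& A, std::vector<double>& b,
                             std::size_t n, double tol = 1e-14) {
    assert(A.size() == n * n && b.size() == n);
    // col_perm[j] = index of the original unknown now stored in column j.
    std::vector<std::size_t> col_perm(n);
    for (std::size_t j = 0; j < n; ++j) col_perm[j] = j;

    for (std::size_t k = 0; k < n; ++k) {
        // Complete pivoting: search the whole trailing submatrix for the
        // entry of largest magnitude (partial pivoting scans one column).
        std::size_t pr = k, pc = k;
        double best = 0.0;
        for (std::size_t i = k; i < n; ++i)
            for (std::size_t j = k; j < n; ++j) {
                double v = std::fabs(A[i * n + j]);
                if (v > best) { best = v; pr = i; pc = j; }
            }
        if (best <= tol) return false;  // numerically singular

        if (pr != k) {  // row swap, applied to A and b
            for (std::size_t j = 0; j < n; ++j)
                std::swap(A[k * n + j], A[pr * n + j]);
            std::swap(b[k], b[pr]);
        }
        if (pc != k) {  // column swap, which reorders the unknowns
            for (std::size_t i = 0; i < n; ++i)
                std::swap(A[i * n + k], A[i * n + pc]);
            std::swap(col_perm[k], col_perm[pc]);
        }
        // Eliminate below the pivot; multipliers are stored in place.
        for (std::size_t i = k + 1; i < n; ++i) {
            double m = A[i * n + k] / A[k * n + k];
            A[i * n + k] = m;
            for (std::size_t j = k + 1; j < n; ++j)
                A[i * n + j] -= m * A[k * n + j];
            b[i] -= m * b[k];
        }
    }

    // Back substitution on the upper triangle gives the permuted unknowns.
    std::vector<double> y(n);
    for (std::size_t i = n; i-- > 0;) {
        double s = b[i];
        for (std::size_t j = i + 1; j < n; ++j) s -= A[i * n + j] * y[j];
        y[i] = s / A[i * n + i];
    }
    // Undo the column permutation so b holds x in the original order.
    for (std::size_t j = 0; j < n; ++j) b[col_perm[j]] = y[j];
    return true;
}

// Hypothetical batch driver: every system is independent, so each loop
// iteration could be handed to its own worker. Returns the number solved.
std::size_t solve_batch(std::vector<std::vector<double>>& As,
                        std::vector<std::vector<double>>& bs, std::size_t n) {
    assert(As.size() == bs.size());
    std::size_t solved = 0;
    for (std::size_t s = 0; s < As.size(); ++s)
        if (solve_complete_pivoting(As[s], bs[s], n)) ++solved;
    return solved;
}
```

Because the pivot search covers the entire trailing submatrix at every step, complete pivoting costs more comparisons than partial pivoting, which is consistent with the abstract's remark that other solvers may be faster under certain conditions while this approach targets higher numerical accuracy.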

Research Organization:
Pacific Northwest National Lab. (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1179528
Report Number(s):
PNNL-SA-100792
Resource Relation:
Related Information: Numerical Computations with GPUs, 87-101
Country of Publication:
United States
Language:
English

Similar Records

Accelerating Subsurface Transport Simulation on Heterogeneous Clusters
Conference · 2013-09-23 · OSTI ID:1179528

Power/Performance Trade-offs of Small Batched LU Based Solvers on GPUs
Conference · 2013-08-26 · OSTI ID:1179528

Performance Portable Batched Sparse Linear Solvers
Journal Article · 2023-05-01 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:1179528