A Flexible CUDA LU-based Solver for Small, Batched Linear Systems

Tumeo, Antonino; Gawande, Nitin A.; Villa, Oreste

doi:10.1007/978-3-319-06548-9_5

Book · June 9, 2014

DOI: https://doi.org/10.1007/978-3-319-06548-9_5 · OSTI ID: 1179528

This chapter presents the implementation of a batched CUDA solver based on LU factorization for small linear systems. This solver may be used in applications such as reactive flow transport models, which apply the Newton-Raphson technique to linearize and iteratively solve the sets of nonlinear equations that represent the reactions for tens of thousands to millions of physical locations. The implementation exploits somewhat counterintuitive GPGPU programming techniques: it assigns each matrix (representing one system) to a single CUDA thread, does not use shared memory, and employs dynamic memory allocation on the GPU. These techniques enable our implementation to simultaneously solve sets of systems with over 100 equations and to employ LU decomposition with complete pivoting, providing the higher numerical accuracy required by certain applications. Other currently available batched linear solvers are limited in system size and support only partial pivoting, although they may be faster under certain conditions. We discuss the code of our implementation and compare it with the other implementations, examining the tradeoffs in terms of performance and flexibility. This work will enable developers who need batched linear solvers to choose whichever implementation is most appropriate to the features and requirements of their applications, and even to implement dynamic switching approaches that select the best implementation depending on the input data.
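As an illustration of the per-system work described in the abstract, the following is a minimal sequential sketch in Python, not the chapter's CUDA code. It solves each small dense system by Gaussian elimination with complete pivoting (the elimination that underlies LU with complete pivoting), searching the whole trailing submatrix for the pivot at every step. The function names `solve_complete_pivoting` and `solve_batch` are hypothetical, and the batch loop only stands in for the one-system-per-thread mapping that the abstract describes.

```python
def solve_complete_pivoting(a, b):
    """Solve a x = b for one small dense system (lists of floats).

    Illustrative only: elimination with complete pivoting, where the
    forward substitution is folded into the elimination loop.
    """
    n = len(a)
    a = [row[:] for row in a]   # work on copies, leave inputs untouched
    b = b[:]
    col_perm = list(range(n))   # col_perm[i] = original unknown at position i

    for k in range(n):
        # Complete pivoting: pick the largest-magnitude entry in the
        # trailing submatrix a[k:, k:], not just in column k.
        p, q, best = k, k, abs(a[k][k])
        for i in range(k, n):
            for j in range(k, n):
                if abs(a[i][j]) > best:
                    p, q, best = i, j, abs(a[i][j])
        if best == 0.0:
            raise ValueError("matrix is singular")

        # Row swap (also applied to the right-hand side).
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Column swap (reorders the unknowns, so record it).
        for row in a:
            row[k], row[q] = row[q], row[k]
        col_perm[k], col_perm[q] = col_perm[q], col_perm[k]

        # Eliminate below the pivot; multipliers (the L factor) are
        # stored in place of the zeroed entries.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            a[i][k] = m
            for j in range(k + 1, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]

    # Back substitution with the upper-triangular factor.
    y = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(a[i][j] * y[j] for j in range(i + 1, n))
        y[i] = s / a[i][i]

    # Undo the column permutation to recover the original ordering.
    x = [0.0] * n
    for i in range(n):
        x[col_perm[i]] = y[i]
    return x


def solve_batch(systems):
    """Solve independent (a, b) systems one after another.

    In the design the abstract describes, each system would instead be
    handled by its own CUDA thread; this loop is a sequential stand-in.
    """
    return [solve_complete_pivoting(a, b) for a, b in systems]
```

Because the systems are independent, the loop body has no cross-iteration dependencies, which is what makes a one-system-per-thread mapping possible in principle.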

Research Organization: Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

Sponsoring Organization: USDOE

DOE Contract Number: AC05-76RL01830

OSTI ID: 1179528

Report Number(s): PNNL-SA-100792

Resource Relation: Related Information: Numerical Computations with GPUs, 87-101

Country of Publication: United States

Language: English

Similar Records

Accelerating Subsurface Transport Simulation on Heterogeneous Clusters

Conference · September 23, 2013 · OSTI ID:1179528

Villa, Oreste; Gawande, Nitin A.; Tumeo, Antonino

Power/Performance Trade-offs of Small Batched LU Based Solvers on GPUs

Conference · August 26, 2013 · OSTI ID:1179528

Villa, Oreste; Fatica, Massimiliano; Gawande, Nitin A.; +1 more

Performance Portable Batched Sparse Linear Solvers

Journal Article · May 1, 2023 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:1179528

Liegeois, Kim; Rajamanickam, Sivasankaran; Berger-Vergiat, Luc

Related Subjects

GPUs
LU decomposition
batched linear solvers
