This paper presents a parallel preconditioning approach based on incomplete LU (ILU) factorizations in the framework of Domain Decomposition (DD) for general sparse linear systems. We focus on distributed-memory parallel architectures, specifically those equipped with graphics processing units (GPUs). In addition to block-Jacobi, we present general-purpose two-level ILU Schur complement-based approaches, with different strategies for solving the coarse-level reduced system. These strategies are combined with modified ILU methods in the construction of the coarse-level operator, in order to effectively remove smooth errors by targeting an algebraically smooth vector. We leverage available GPU-based sparse matrix kernels to accelerate the setup and solve phases of the proposed ILU preconditioner. We evaluate the efficiency of the proposed methods as a smoother for algebraic multigrid (AMG) and as a preconditioner for Krylov subspace methods on challenging anisotropic diffusion problems and a collection of general sparse matrices.
Xu, Tianshi, et al. "A two-level GPU-accelerated incomplete LU preconditioner for general sparse linear systems." International Journal of High Performance Computing Applications, vol. 39, no. 3, Feb. 2025. https://doi.org/10.1177/10943420251319334
Xu, Tianshi, Li, Rui Peng, & Osei-Kuffuor, Daniel (2025). A two-level GPU-accelerated incomplete LU preconditioner for general sparse linear systems. International Journal of High Performance Computing Applications, 39(3). https://doi.org/10.1177/10943420251319334
Xu, Tianshi, Li, Rui Peng, and Osei-Kuffuor, Daniel, "A two-level GPU-accelerated incomplete LU preconditioner for general sparse linear systems," International Journal of High Performance Computing Applications 39, no. 3 (2025), https://doi.org/10.1177/10943420251319334
@article{osti_2537970,
author = {Xu, Tianshi and Li, Rui Peng and Osei-Kuffuor, Daniel},
title = {A two-level GPU-accelerated incomplete LU preconditioner for general sparse linear systems},
annote = {This paper presents a parallel preconditioning approach based on incomplete LU (ILU) factorizations in the framework of Domain Decomposition (DD) for general sparse linear systems. We focus on distributed-memory parallel architectures, specifically those equipped with graphics processing units (GPUs). In addition to block-Jacobi, we present general-purpose two-level ILU Schur complement-based approaches, with different strategies for solving the coarse-level reduced system. These strategies are combined with modified ILU methods in the construction of the coarse-level operator, in order to effectively remove smooth errors by targeting an algebraically smooth vector. We leverage available GPU-based sparse matrix kernels to accelerate the setup and solve phases of the proposed ILU preconditioner. We evaluate the efficiency of the proposed methods as a smoother for algebraic multigrid (AMG) and as a preconditioner for Krylov subspace methods on challenging anisotropic diffusion problems and a collection of general sparse matrices.},
doi = {10.1177/10943420251319334},
url = {https://www.osti.gov/biblio/2537970},
journal = {International Journal of High Performance Computing Applications},
issn = {1094-3420},
number = {3},
volume = {39},
place = {United States},
publisher = {SAGE},
year = {2025},
month = {02}}
Research Organization:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
AC52-07NA27344
OSTI ID:
2537970
Alternate ID(s):
OSTI ID: 2522848
Report Number(s):
LLNL--JRNL-813686; 1021773
Journal Information:
International Journal of High Performance Computing Applications, Vol. 39, Issue 3; ISSN 1094-3420