Title: Mixed precision s-step Lanczos and conjugate gradient algorithms

Journal Article · Numerical Linear Algebra with Applications
DOI: https://doi.org/10.1002/nla.2425 · OSTI ID: 1834328

Abstract Compared to the classical Lanczos algorithm, the s-step Lanczos variant has the potential to improve performance by asymptotically decreasing the synchronization cost per iteration. However, this comes at a price; despite being mathematically equivalent, the s-step variant may behave quite differently in finite precision, potentially exhibiting greater loss of accuracy and slower convergence relative to the classical algorithm. It has previously been shown that the errors in the s-step version follow the same structure as the errors in the classical algorithm, but are amplified by a factor depending on the square of the condition number of the O(s)-dimensional Krylov bases computed in each outer loop. As the condition number of these s-step bases grows (in some cases very quickly) with s, this limits the s values that can be chosen and thus can limit the attainable performance. In this work, we show that if a select few computations in s-step Lanczos are performed in double the working precision, the error terms then depend only linearly on the conditioning of the s-step bases. This has the potential for drastically improving the numerical behavior of the algorithm with little impact on per-iteration performance. Our numerical experiments demonstrate the improved numerical behavior possible with the mixed precision approach, and also show that this improved behavior extends to mixed precision s-step CG. We present preliminary performance results on NVIDIA V100 GPUs that show that the overhead of extra precision is minimal if one uses precisions implemented in hardware.
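
The two effects the abstract describes, basis conditioning that grows with s and the benefit of doing selected computations in double the working precision, can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a monomial Krylov basis [v, Av, ..., A^s v], float32 as the working precision, float64 as "double the working precision", and the Gram matrix of the basis as one of the selected computations; the abstract states none of these choices. The diagonal test matrix and all function names are hypothetical.

```python
import math
import numpy as np

def monomial_basis(A, v, s):
    """Return Y = [v, Av, ..., A^s v] (first column normalized) in A's dtype."""
    Y = np.empty((v.shape[0], s + 1), dtype=A.dtype)
    Y[:, 0] = v / np.linalg.norm(v)
    for j in range(s):
        Y[:, j + 1] = A @ Y[:, j]
    return Y

def reference_gram(Y):
    """Reference Gram matrix of a float32 basis.

    Products of two float32 values are exact in float64, and math.fsum
    returns a correctly rounded sum, so every entry is accurate to
    float64 rounding.
    """
    Y64 = Y.astype(np.float64)
    m = Y64.shape[1]
    G = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            G[i, j] = math.fsum(Y64[:, i] * Y64[:, j])
    return G

def gram_relative_error(Y, dtype):
    """Relative Frobenius-norm error of Y^T Y when it is formed in `dtype`."""
    Yd = Y.astype(dtype)
    G = (Yd.T @ Yd).astype(np.float64)
    G_ref = reference_gram(Y)
    return np.linalg.norm(G - G_ref) / np.linalg.norm(G_ref)

def run_demo(n=200, s_values=(2, 4, 6), seed=0):
    """Return (s, cond(Y), float32 Gram error, float64 Gram error) per s."""
    rng = np.random.default_rng(seed)
    # Toy symmetric positive definite matrix (assumption): diagonal, spectrum in [1, 10].
    A = np.diag(np.linspace(1.0, 10.0, n)).astype(np.float32)
    v = rng.standard_normal(n).astype(np.float32)
    rows = []
    for s in s_values:
        Y = monomial_basis(A, v, s)  # basis built in working precision
        cond = np.linalg.cond(Y.astype(np.float64))
        err_work = gram_relative_error(Y, np.float32)
        err_double = gram_relative_error(Y, np.float64)
        rows.append((s, cond, err_work, err_double))
    return rows

if __name__ == "__main__":
    for s, cond, err_work, err_double in run_demo():
        print(f"s={s}  cond(Y)={cond:.2e}  "
              f"Gram error float32={err_work:.2e}  float64={err_double:.2e}")
```

Running the script prints, for each s, the 2-norm condition number of the basis and the Gram matrix error in each precision. The sketch only shows that the basis becomes more ill-conditioned as s grows and that forming the Gram matrix in the higher precision removes most of its rounding error; it does not reproduce the paper's error bounds or its GPU results.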

Research Organization:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
NA0003525; 17‐SC‐20‐SC; DE‐NA0003525
OSTI ID:
1834328
Alternate ID(s):
OSTI ID: 1832252
Report Number(s):
SAND-2021-14314J; 701479
Journal Information:
Numerical Linear Algebra with Applications, Vol. 29, Issue 3; ISSN 1070-5325
Publisher:
Wiley
Country of Publication:
United States
Language:
English
