A survey of numerical linear algebra methods utilizing mixed-precision arithmetic

Abdelfattah, Ahmad; Anzt, Hartwig; Boman, Erik G.; Carson, Erin; Cojean, Terry; Dongarra, Jack; Fox, Alyson; Gates, Mark; Higham, Nicholas J.; Li, Xiaoye S.; Loe, Jennifer; Luszczek, Piotr; Pranesh, Srikara; Rajamanickam, Siva; Ribizel, Tobias; Smith, Barry F.; Swirydowicz, Kasia; Thomas, Stephen; Tomov, Stanimire; Tsai, Yaohung M.; Yang, Ulrike Meier

doi:10.1177/10943420211003313

Journal Article · March 19, 2021 · International Journal of High Performance Computing Applications

DOI: https://doi.org/10.1177/10943420211003313 · OSTI ID: 1825849

Abdelfattah, Ahmad ^[1]; Anzt, Hartwig ^[2]; Boman, Erik G. ^[3]; Carson, Erin ^[4]; Cojean, Terry ^[5]; Dongarra, Jack ^[6]; Fox, Alyson ^[7]; Gates, Mark ^[1]; Higham, Nicholas J. ^[8]; Li, Xiaoye S. ^[9]; Loe, Jennifer ^[3]; Luszczek, Piotr ^[1]; Pranesh, Srikara ^[8]; Rajamanickam, Siva ^[3]; Ribizel, Tobias ^[5]; Smith, Barry F. ^[10]; Swirydowicz, Kasia ^[11]; Thomas, Stephen ^[11]; Tomov, Stanimire ^[1]; Tsai, Yaohung M. ^[1]; Yang, Ulrike Meier

[1] Univ. of Tennessee, Knoxville, TN (United States)
[2] Univ. of Tennessee, Knoxville, TN (United States); Karlsruhe Inst. of Technology (KIT) (Germany)
[3] Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
[4] Charles Univ., Prague (Czech Republic)
[5] Karlsruhe Inst. of Technology (KIT) (Germany)
[6] Univ. of Tennessee, Knoxville, TN (United States); Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Univ. of Manchester (United Kingdom)
[7] Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
[8] Univ. of Manchester (United Kingdom)
[9] Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
[10] Argonne National Lab. (ANL), Argonne, IL (United States)
[11] National Renewable Energy Lab. (NREL), Golden, CO (United States)

The efficient utilization of mixed-precision numerical linear algebra algorithms can offer attractive acceleration to scientific computing applications. Especially with the hardware integration of low-precision special-function units designed for machine learning applications, the traditional numerical algorithms community urgently needs to reconsider the floating-point formats used in the distinct operations to efficiently leverage the available compute power. In this study, we provide a comprehensive survey of mixed-precision numerical linear algebra routines, including the underlying concepts, theoretical background, and experimental results for both dense and sparse linear algebra problems.

Research Organization: Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States)

Sponsoring Organization: USDOE National Nuclear Security Administration (NNSA)

Grant/Contract Number: AC52-07NA27344

OSTI ID: 1825849

Report Number(s): LLNL-JRNL--826451; 1041053

Journal Information: International Journal of High Performance Computing Applications, Vol. 35, Issue 4; ISSN 1094-3420

Publisher: SAGE

Country of Publication: United States

Language: English

Related Subjects

97 MATHEMATICS AND COMPUTING
GPUs
Mixed-precision arithmetic
high-performance computing
linear algebra
numerical mathematics
