Mixed-precision iterative refinement using tensor cores on GPUs to accelerate solution of linear systems
Journal Article · Proceedings of the Royal Society. A. Mathematical, Physical and Engineering Sciences
- NVIDIA, Santa Clara, CA (United States)
- Univ. of Tennessee, Knoxville, TN (United States). Dept. of Electrical Engineering and Computer Science
- Univ. of Tennessee, Knoxville, TN (United States). Dept. of Electrical Engineering and Computer Science; Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Computer Science and Mathematics Division; Univ. of Manchester (United Kingdom). Dept. of Mathematics
- Univ. of Manchester (United Kingdom). Dept. of Mathematics
Double-precision floating-point arithmetic (FP64) has been the de facto standard for engineering and scientific simulations for several decades. Problem complexity and the sheer volume of data coming from various instruments and sensors motivate researchers to mix and match various approaches to optimize compute resources, including different levels of floating-point precision. In recent years, machine learning has motivated hardware support for half-precision floating-point arithmetic. A primary challenge in high-performance computing is to leverage reduced-precision and mixed-precision hardware. We show how the FP16/FP32 Tensor Cores on NVIDIA GPUs can be exploited to accelerate the solution of linear systems of equations Ax = b without sacrificing numerical stability. The techniques we employ include multiprecision LU factorization, the preconditioned generalized minimal residual algorithm (GMRES), and scaling and auto-adaptive rounding to avoid overflow. We also show how to efficiently handle systems with multiple right-hand sides. On the NVIDIA Quadro GV100 (Volta) GPU, we achieve a 4×-5× performance increase and 5× better energy efficiency versus the standard FP64 implementation while maintaining an FP64 level of numerical stability.
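The core idea in the abstract, factorizing in low precision and then recovering FP64 accuracy by refinement, can be illustrated with a small sketch. This is not the authors' implementation. It assumes plain LU-based iterative refinement instead of the preconditioned GMRES variant the abstract mentions, it emulates FP16 on the CPU by rounding through Python's `struct` half-precision format, and it rounds every update to FP16, which is cruder than Tensor Cores accumulating in FP32. The names `fp16`, `lu_factor_fp16`, `lu_solve`, and `refine` are hypothetical, as are the test matrix and tolerance.

```python
import struct


def fp16(x):
    """Round a Python float (FP64) to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def lu_factor_fp16(A):
    """LU with partial pivoting; every stored entry is rounded to emulated FP16."""
    n = len(A)
    LU = [[fp16(v) for v in row] for row in A]
    perm = list(range(n))
    for k in range(n):
        # Pick the largest pivot in column k and swap whole rows.
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = fp16(LU[i][k] / LU[k][k])
            for j in range(k + 1, n):
                LU[i][j] = fp16(LU[i][j] - LU[i][k] * LU[k][j])
    return LU, perm


def lu_solve(LU, perm, b):
    """Forward and back substitution with the low-precision factors."""
    n = len(b)
    y = [b[p] for p in perm]
    for i in range(n):  # L has an implicit unit diagonal
        for j in range(i):
            y[i] -= LU[i][j] * y[j]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] -= LU[i][j] * y[j]
        y[i] /= LU[i][i]
    return y


def refine(A, b, tol=1e-12, max_iter=50):
    """Iterative refinement: FP16 factors, FP64 residuals and updates."""
    LU, perm = lu_factor_fp16(A)
    x = lu_solve(LU, perm, b)
    for it in range(max_iter):
        # Residual r = b - A x, computed in FP64 with the original matrix.
        r = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            return x, it
        d = lu_solve(LU, perm, r)  # correction from the cheap factors
        x = [xi + di for xi, di in zip(x, d)]
    return x, max_iter


# Small, well-conditioned example (hypothetical data).
A = [[4.1, 1.3, 0.7], [1.2, 3.3, 0.4], [0.6, 0.9, 2.7]]
x_true = [1.0, -2.0, 3.0]
b = [sum(a * t for a, t in zip(row, x_true)) for row in A]
x, iters = refine(A, b)
```

The expensive factorization is done once in low precision, while each refinement step costs only a matrix-vector product and two triangular solves. In this sketch convergence relies on the matrix being well conditioned relative to FP16 accuracy; the abstract's GMRES-based approach and scaling are aimed at harder cases.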
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- OSTI ID:
- 1787013
- Journal Information:
- Journal Name: Proceedings of the Royal Society. A. Mathematical, Physical and Engineering Sciences; Journal Issue: 2243; Vol. 476; ISSN 1364-5021
- Publisher:
- The Royal Society Publishing
- Country of Publication:
- United States
- Language:
- English
Similar Records
FTTN: Feature-Targeted Testing for Numerical Properties of NVIDIA & AMD Matrix Accelerators
Conference · Tue Oct 08 00:00:00 EDT 2024 · OSTI ID:2539787
A GPU accelerated mixed-precision Smoothed Particle Hydrodynamics framework with cell-based relative coordinates
Journal Article · Sun Jan 28 19:00:00 EST 2024 · Engineering Analysis with Boundary Elements · OSTI ID:2283916
Analyzing Deep Learning Model Inferences for Image Classification using OpenVINO
Conference · Tue Dec 31 23:00:00 EST 2019 · OSTI ID:1804060