Quantifying the Impact of Single Bit Flips on Floating Point Arithmetic

Elliott, James J; Mueller, Frank; Stoyanov, Miroslav K; Webster, Clayton G

doi:10.2172/1089338

Technical Report · August 1, 2013

DOI: https://doi.org/10.2172/1089338 · OSTI ID: 1089338

Elliott, James J [1]; Mueller, Frank [2]; Stoyanov, Miroslav K [1]; Webster, Clayton G [1]

[1] ORNL
[2] North Carolina State University

In high-end computing, the collective surface area, smaller fabrication sizes, and increasing density of components have led to an increase in the number of observed bit flips. If mechanisms are not in place to detect them, such flips produce silent errors, i.e., the code returns a result that deviates from the desired solution by more than the allowed tolerance, and the discrepancy cannot be distinguished from the standard numerical error associated with the algorithm. These phenomena are believed to occur more frequently in DRAM, but logic gates, arithmetic units, and other circuits are also susceptible to bit flips. Previous work has focused on algorithmic techniques for detecting and correcting bit flips in specific data structures; however, these techniques lack generality and often cannot be implemented in heterogeneous computing environments. Our work takes a novel approach to this problem. We focus on quantifying the impact of a single bit flip on specific floating-point operations. We analyze the error induced by flipping specific bits in the most widely used IEEE floating-point representation in an architecture-agnostic manner, i.e., without requiring proprietary information such as bit flip rates and vendor-specific circuit designs. We initially study dot products of vectors and demonstrate that not all bit flips create a large error and, more importantly, that the expected value of the relative magnitude of the error is very sensitive to the bit pattern of the binary representation of the exponent, which strongly depends on scaling. Our results are derived analytically and then verified experimentally with Monte Carlo sampling of random vectors. Furthermore, we consider the natural resilience properties of solvers based on fixed-point iteration, and we demonstrate how the resilience of the Jacobi method for linear equations can be significantly improved by rescaling the associated matrix.
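
The dependence on the exponent bit pattern described in the abstract can be illustrated with a minimal sketch. It is not taken from the report: it assumes the IEEE 754 binary64 (double precision) format, and the helper names `flip_bit` and `relative_error` are hypothetical. Bit 0 is the least significant mantissa bit, bits 52 to 62 hold the exponent, and bit 63 is the sign.

```python
import math
import struct


def flip_bit(x: float, bit: int) -> float:
    """Return x with one bit of its IEEE 754 binary64 encoding flipped.

    Hypothetical helper for illustration only.
    bit 0 = least significant mantissa bit, 52..62 = exponent, 63 = sign.
    """
    if not 0 <= bit <= 63:
        raise ValueError("bit must be in 0..63")
    # Reinterpret the double as a 64-bit unsigned integer.
    (pattern,) = struct.unpack("<Q", struct.pack("<d", x))
    # Flip the requested bit and reinterpret the result as a double.
    (flipped,) = struct.unpack("<d", struct.pack("<Q", pattern ^ (1 << bit)))
    return flipped


def relative_error(x: float, bit: int) -> float:
    """Relative error |flip(x) - x| / |x| for a nonzero finite x."""
    y = flip_bit(x, bit)
    if math.isnan(y) or math.isinf(y):
        return math.inf  # the flip produced a non-finite value
    return abs(y - x) / abs(x)


if __name__ == "__main__":
    # Same bit positions, differently scaled inputs.
    for x in (0.5, 1.0, 2.0):
        for bit in (0, 52, 62, 63):
            print(f"x={x:<4} bit={bit:<2} -> {flip_bit(x, bit)!r:<24} "
                  f"rel. error={relative_error(x, bit):.3e}")
```

Flipping the most significant exponent bit (bit 62) turns 1.0 into infinity but turns 2.0 into 0.0, a relative error of exactly 1. A flip in the lowest mantissa bit of 1.0 changes it by only 2^-52. The same fault therefore has very different consequences depending on how the data are scaled.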

Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

Sponsoring Organization: USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE Office of Science (SC)

DOE Contract Number: DE-AC05-00OR22725

OSTI ID: 1089338

Report Number(s): ORNL/TM-2013/282; KJ0402000; ERKJU16

Country of Publication: United States

Language: English

Similar Records

Implementation of Jacobi rotations for accurate singular value computation in floating point arithmetic

Journal Article · July 1, 1997 · SIAM Journal on Scientific Computing · OSTI ID:1089338

Drmac, Z

A pipelined 50-MHz CMOS 64-bit floating-point arithmetic processor

Journal Article · October 1, 1989 · IEEE Journal of Solid-State Circuits (Institute of Electrical and Electronics Engineers); (USA) · OSTI ID:1089338

Benschneider, B J; Bowhill, W J; Cooper, E M; +6 more

Finite-precision arithmetic in singular-value decomposition architectures

Thesis/Dissertation · January 1, 1987 · OSTI ID:1089338

Duryea, R A
