U.S. Department of Energy
Office of Scientific and Technical Information

Extending Moore's Law via Computationally Error Tolerant Computing.

Journal Article · ACM Transactions on Architecture and Code Optimization
DOI: https://doi.org/10.1145/3177837 · OSTI ID: 1432788
 [1];  [1];  [1];  [1];  [2];  [2];  [2]
  1. Georgia Inst. of Technology, Atlanta, GA (United States)
  2. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Dennard scaling has ended. Lowering the supply voltage (Vdd) to sub-volt levels causes intermittent losses in signal integrity, so further voltage scaling is no longer an acceptable means of lowering the power required by a processor core. However, it is possible to correct the occasional errors caused by lower Vdd efficiently and thereby lower power. By deploying the right amount and kind of redundancy, we can strike a balance between the overhead incurred in achieving reliability and the energy savings realized by permitting lower Vdd. One promising approach is the Redundant Residue Number System (RRNS) representation. Unlike other error-correcting codes, RRNS is closed under addition, subtraction, and multiplication, which enables computational error correction at a fraction of the overhead of conventional approaches. We use the RRNS scheme to design a Computationally-Redundant, Energy-Efficient core, including the microarchitecture, Instruction Set Architecture (ISA), and RRNS-centered algorithms. Simulation results show that this RRNS system can reduce the energy-delay product by about 3× for multiplication-intensive workloads and by about 2× in general, compared to a non-error-correcting binary core.
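The closure property mentioned in the abstract can be illustrated with a small sketch. This is not the authors' microarchitecture or correction algorithm: the moduli, the function names, and the brute-force leave-one-out correction are all assumptions chosen for illustration. It only shows that arithmetic carried out residue by residue still yields a valid codeword, and that a single corrupted residue can be detected and repaired with two redundant moduli.

```python
from math import prod

# Hypothetical, pairwise-coprime moduli (not taken from the paper).
# The first three carry the value; the last two are redundant.
INFO_MODULI = (7, 9, 11)
REDUNDANT_MODULI = (13, 16)          # each larger than every info modulus
MODULI = INFO_MODULI + REDUNDANT_MODULI
LEGIT_RANGE = prod(INFO_MODULI)      # valid values are 0 .. 692


def encode(x):
    """Represent x by its residue for every modulus."""
    return tuple(x % m for m in MODULI)


def add(a, b):
    """Residue-wise addition; the result is again an RRNS codeword."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def sub(a, b):
    """Residue-wise subtraction."""
    return tuple((x - y) % m for x, y, m in zip(a, b, MODULI))


def mul(a, b):
    """Residue-wise multiplication."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def crt(residues, moduli):
    """Chinese Remainder Theorem reconstruction over the given moduli."""
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        x += r * partial * pow(partial, -1, m)   # modular inverse (Python 3.8+)
    return x % total


def is_consistent(word):
    """A codeword is valid if its full reconstruction lies in the legitimate range."""
    return crt(word, MODULI) < LEGIT_RANGE


def correct_single_error(word):
    """Recover the value assuming at most one residue is corrupted.

    Brute-force illustration: drop each residue in turn and accept the
    reconstruction that falls back inside the legitimate range.
    """
    if is_consistent(word):
        return crt(word, MODULI)
    for skip in range(len(MODULI)):
        residues = [r for i, r in enumerate(word) if i != skip]
        moduli = [m for i, m in enumerate(MODULI) if i != skip]
        candidate = crt(residues, moduli)
        if candidate < LEGIT_RANGE:
            return candidate
    raise ValueError("more than one residue appears to be corrupted")


# Usage: compute 25 * 17 in RRNS form, corrupt one residue, then recover.
product_word = mul(encode(25), encode(17))
faulty = list(product_word)
faulty[1] = (faulty[1] + 3) % MODULI[1]          # simulated low-Vdd fault
recovered = correct_single_error(tuple(faulty))
print(recovered)
```

Because each residue channel is narrow and independent, the check covers the arithmetic itself rather than only stored data. Results must stay below `LEGIT_RANGE` for the consistency test to be meaningful, so in this sketch the operands are kept small.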
Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
AC04-94AL85000
OSTI ID:
1432788
Report Number(s):
SAND--2018-0257J; 659857
Journal Information:
ACM Transactions on Architecture and Code Optimization, Vol. 15, Issue 1; ISSN 1544-3566
Publisher:
Association for Computing Machinery
Country of Publication:
United States
Language:
English


