Extending Moore's Law via Computationally Error Tolerant Computing.
Journal Article · ACM Transactions on Architecture and Code Optimization
- Georgia Inst. of Technology, Atlanta, GA (United States)
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Dennard scaling has ended. Lowering the supply voltage (Vdd) to sub-volt levels causes intermittent losses in signal integrity, so further voltage scaling is no longer an acceptable means of lowering the power required by a processor core. However, it is possible to correct the occasional errors caused by lower Vdd efficiently and thereby lower power. By deploying the right amount and kind of redundancy, we can strike a balance between the overhead incurred in achieving reliability and the energy savings realized by permitting lower Vdd. One promising approach is the Redundant Residue Number System (RRNS) representation. Unlike other error-correcting codes, RRNS has the important property of being closed under addition, subtraction, and multiplication, thus enabling computational error correction at a fraction of the overhead of conventional approaches. We use the RRNS scheme to design a Computationally-Redundant, Energy-Efficient core, including the microarchitecture, Instruction Set Architecture (ISA), and RRNS-centered algorithms. Simulation results show that this RRNS system can reduce the energy-delay product by about 3× for multiplication-intensive workloads and by about 2× in general, compared to a non-error-correcting binary core.
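The closure property mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's microarchitecture or its algorithms. The moduli (3, 5, 7 plus redundant 11, 13), the function names, and the trial-and-drop decoding strategy are all assumptions chosen for illustration. The sketch shows that arithmetic is performed independently on each residue, and that a single corrupted residue in the result can still be located and corrected, provided the true result stays inside the legitimate range.

```python
from math import prod

# Hypothetical moduli for illustration only: pairwise coprime, with the
# redundant moduli larger than every non-redundant modulus.
NON_REDUNDANT = (3, 5, 7)           # legitimate values: 0 .. 104
REDUNDANT = (11, 13)                # two extra residues: correct one error
MODULI = NON_REDUNDANT + REDUNDANT
LEGIT_RANGE = prod(NON_REDUNDANT)   # 105


def encode(x):
    """Represent an integer as its tuple of residues."""
    return tuple(x % m for m in MODULI)


def add(a, b):
    """Residue-wise addition; no carries cross between residues."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def mul(a, b):
    """Residue-wise multiplication; the redundancy is carried along."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def crt(residues, moduli):
    """Chinese Remainder Theorem reconstruction over the given moduli."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)  # modular inverse (Python 3.8+)
    return total % big_m


def decode(residues):
    """Return (value, bad_index); bad_index is None if no error was found.

    Assumed strategy: a word is consistent when its full reconstruction
    lies in the legitimate range. Otherwise drop one residue at a time
    until the reconstruction from the remaining residues is legitimate.
    """
    value = crt(residues, MODULI)
    if value < LEGIT_RANGE:
        return value, None
    for skip in range(len(MODULI)):
        kept_r = [r for i, r in enumerate(residues) if i != skip]
        kept_m = [m for i, m in enumerate(MODULI) if i != skip]
        candidate = crt(kept_r, kept_m)
        if candidate < LEGIT_RANGE:
            return candidate, skip
    raise ValueError("more errors than this code can correct")


product = mul(encode(9), encode(11))   # 9 * 11 = 99, inside the range
corrupted = list(product)
corrupted[3] = (corrupted[3] + 1) % MODULI[3]   # inject one residue fault
```

With these assumed moduli, `decode(product)` yields `(99, None)` and `decode(corrupted)` yields `(99, 3)`, which identifies the faulty residue position. Results that exceed the legitimate range (here, 105) would be misread as errors, so a real design has to size the moduli to the operand width.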
- Research Organization:
- Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- AC04-94AL85000
- OSTI ID:
- 1432788
- Report Number(s):
- SAND--2018-0257J; 659857
- Journal Information:
- ACM Transactions on Architecture and Code Optimization, Vol. 15, Issue 1; ISSN 1544-3566
- Publisher:
- Association for Computing Machinery
- Country of Publication:
- United States
- Language:
- English
Similar Records
Dynamic Undervolting to Improve Energy Efficiency on Multicore X86 CPUs
Journal Article · Mon Jun 22 20:00:00 EDT 2020 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:1722960
Algorithm-Based Fault Tolerance for Convolutional Neural Networks
Journal Article · Wed Dec 30 19:00:00 EST 2020 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID:1775093
Approaches for a reliable high-performance distributed-parallel storage system
Conference · Mon Dec 30 23:00:00 EST 1996 · OSTI ID:421392