DOE PAGES: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Designing and Evaluating Redundancy-Based Soft-Error Masking on a Continuum of Energy versus Robustness

Abstract

Near-threshold computing is an effective strategy to reduce the power dissipation of deeply-scaled CMOS logic circuits. However, near-threshold strategies exacerbate the impact of delay variations on device performance and increase the susceptibility to soft errors due to narrow voltage margins. The objective of this work is to develop and assess design approaches that leverage tradeoffs between performance and the resilience of fault masking coverage for various soft-error mitigation techniques. The primary insight from this work is identification of redundancy-based hardening techniques that can deliver increased benefits in terms of the fault coverage energy ratio (FCER) for the leveraged tradeoffs within iso-energy constraints at near-threshold voltage (NTV). Simulation results demonstrate that temporal redundancy approaches offer favorable tradeoffs in terms of FCER. They exhibit reduced impact on performance variations and achieve extensive soft fault masking, therefore improving the system robustness within acceptable delay constraints. Meanwhile, it is shown that a hybrid redundancy approach can be used to protect a low-power system to maintain throughput while tolerating soft errors. We demonstrate how the FCER metric can be used as an optimization parameter to guide circuit synthesis to meet performance and robustness goals. Lastly, the impact of design diversity on spatial and hybrid redundancy at NTV is assessed in terms of FCER and delay variation to form overall recommendations regarding soft-error mitigation at NTV.

Authors:
Alghareb, Faris Salih [1]; Ashraf, Rizwan A. [2]; DeMara, Ronald F. [1]
  1. Univ. of Central Florida, Orlando, FL (United States)
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1513408
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Accepted Manuscript
Journal Name:
IEEE Transactions on Sustainable Computing
Additional Journal Information:
Journal Volume: 3; Journal Issue: 3; Journal ID: ISSN 2377-3790
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; delay variation; design diversity; energy-efficient computing; fault resilience; near-threshold voltage (NTV); redundancy-based mitigation techniques; reliability; soft-error rate (SER)

Citation Formats

Alghareb, Faris Salih, Ashraf, Rizwan A., and DeMara, Ronald F. Designing and Evaluating Redundancy-Based Soft-Error Masking on a Continuum of Energy versus Robustness. United States: N. p., 2017. Web. doi:10.1109/TSUSC.2017.2764857.
Alghareb, Faris Salih, Ashraf, Rizwan A., & DeMara, Ronald F. Designing and Evaluating Redundancy-Based Soft-Error Masking on a Continuum of Energy versus Robustness. United States. https://doi.org/10.1109/TSUSC.2017.2764857
Alghareb, Faris Salih, Ashraf, Rizwan A., and DeMara, Ronald F. Thu . "Designing and Evaluating Redundancy-Based Soft-Error Masking on a Continuum of Energy versus Robustness". United States. https://doi.org/10.1109/TSUSC.2017.2764857. https://www.osti.gov/servlets/purl/1513408.
@article{osti_1513408,
title = {Designing and Evaluating Redundancy-Based Soft-Error Masking on a Continuum of Energy versus Robustness},
author = {Alghareb, Faris Salih and Ashraf, Rizwan A. and DeMara, Ronald F.},
doi = {10.1109/TSUSC.2017.2764857},
journal = {IEEE Transactions on Sustainable Computing},
number = 3,
volume = 3,
place = {United States},
year = {2017},
month = {10}
}


Figures / Tables:

TABLE 1: Comparison between redundancy-based soft-error mitigation approaches, where each (✓, –) indicates a relative (strength, limitation/weakness).
