DOE Patents, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Method and apparatus for selective and power-aware memory error protection and memory management

Abstract

A method for providing selective memory error protection responsive to a predictable failure notification associated with at least one portion of a memory in a computing system includes: obtaining an active error correcting code (ECC) configuration corresponding to the portion of the memory; determining whether the active ECC configuration is sufficient to correct at least one error in the portion of the memory affected by the predictable failure notification; when the active ECC configuration is insufficient to correct the error, determining whether data corruption can be tolerated by an application running on the computing system; when data corruption cannot be tolerated by the application, determining whether a stronger ECC level is available and, if a stronger ECC level is available, increasing a strength of the active ECC configuration; and when data corruption can be tolerated, performing page reassignment and aggregation of non-critical data.
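The decision flow in the abstract can be sketched as follows. This is only an illustrative reading of the abstract, not the patented implementation: the names `MemoryPortion`, `CORRECTABLE_BITS`, `Action`, and `on_predicted_failure` are hypothetical, and ECC strength is modeled simply as the number of correctable bit errors per word. The abstract does not say what happens when the active ECC is already sufficient, or when no stronger level exists, so those branches are marked as assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP_CURRENT_ECC = "keep current ECC"              # assumption: no action needed
    INCREASE_ECC = "increase ECC strength"
    REASSIGN_AND_AGGREGATE = "reassign pages and aggregate non-critical data"
    NOT_COVERED = "case not covered by the abstract"   # assumption


# Hypothetical ECC levels: level index -> correctable bit errors per word.
CORRECTABLE_BITS = (0, 1, 2)


@dataclass
class MemoryPortion:
    ecc_level: int  # active ECC configuration, as an index into CORRECTABLE_BITS


def on_predicted_failure(portion: MemoryPortion,
                         predicted_error_bits: int,
                         app_tolerates_corruption: bool) -> Action:
    """Pick a response to a predictable failure notification for one portion."""
    # Obtain the active ECC configuration and check whether it is sufficient.
    if CORRECTABLE_BITS[portion.ecc_level] >= predicted_error_bits:
        return Action.KEEP_CURRENT_ECC

    # Active ECC is insufficient: ask whether the application tolerates corruption.
    if app_tolerates_corruption:
        return Action.REASSIGN_AND_AGGREGATE

    # Corruption is not tolerable: strengthen ECC if a stronger level exists.
    if portion.ecc_level + 1 < len(CORRECTABLE_BITS):
        portion.ecc_level += 1  # one step up; may still need further increases
        return Action.INCREASE_ECC

    return Action.NOT_COVERED
```

For example, a portion at level 1 that receives a notification predicting two-bit errors would be moved to level 2 if the application cannot tolerate corruption, and would otherwise keep its ECC level while its pages are reassigned.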

Inventors:
Andrade Costa, Carlos H.; Cher, Chen-Yong; Park, Yoonho; Rosenburg, Bryan S.; Ryu, Kyung D.
Issue Date:
2018 Nov 27
Research Org.:
International Business Machines Corp., Armonk, NY (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1495044
Patent Number(s):
10141955
Application Number:
14/684,368
Assignee:
International Business Machines Corporation (Armonk, NY)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
H - ELECTRICITY H03 - BASIC ELECTRONIC CIRCUITRY H03M - CODING
DOE Contract Number:  
B599858
Resource Type:
Patent
Resource Relation:
Patent File Date: 2015 Apr 11
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Andrade Costa, Carlos H., Cher, Chen-Yong, Park, Yoonho, Rosenburg, Bryan S., and Ryu, Kyung D. Method and apparatus for selective and power-aware memory error protection and memory management. United States: N. p., 2018. Web.
Andrade Costa, Carlos H., Cher, Chen-Yong, Park, Yoonho, Rosenburg, Bryan S., & Ryu, Kyung D. (2018). Method and apparatus for selective and power-aware memory error protection and memory management. United States.
Andrade Costa, Carlos H., Cher, Chen-Yong, Park, Yoonho, Rosenburg, Bryan S., and Ryu, Kyung D. 2018. "Method and apparatus for selective and power-aware memory error protection and memory management". United States. https://www.osti.gov/servlets/purl/1495044.
@article{osti_1495044,
title = {Method and apparatus for selective and power-aware memory error protection and memory management},
author = {Andrade Costa, Carlos H. and Cher, Chen-Yong and Park, Yoonho and Rosenburg, Bryan S. and Ryu, Kyung D.},
abstractNote = {A method for providing selective memory error protection responsive to a predictable failure notification associated with at least one portion of a memory in a computing system includes: obtaining an active error correcting code (ECC) configuration corresponding to the portion of the memory; determining whether the active ECC configuration is sufficient to correct at least one error in the portion of the memory affected by the predictable failure notification; when the active ECC configuration is insufficient to correct the error, determining whether data corruption can be tolerated by an application running on the computing system; when data corruption cannot be tolerated by the application, determining whether a stronger ECC level is available and, if a stronger ECC level is available, increasing a strength of the active ECC configuration; and when data corruption can be tolerated, performing page reassignment and aggregation of non-critical data.},
place = {United States},
year = {2018},
month = {11}
}

Works referenced in this record:

Measurement-based analysis of fault and error sensitivities of dynamic memory
conference, June 2010


Exploring event correlation for failure prediction in coalitions of clusters
conference, January 2007


System and method for exchanging data
patent, June 2004


Flikker: saving DRAM refresh-power through critical data partitioning
conference, January 2011

  • Liu, Song; Pattabiraman, Karthik; Moscibroda, Thomas
  • Proceedings of the sixteenth international conference on Architectural support for programming languages and operating systems - ASPLOS '11
  • https://doi.org/10.1145/1950365.1950391

Energy-efficient cache design using variable-strength error-correcting codes
conference, January 2011


SECRET: Selective error correction for refresh energy reduction in DRAMs
conference, September 2012

  • Lin, Chung-Hsiang; Shen, De-Yu; Chen, Yi-Jung
  • 2012 IEEE 30th International Conference on Computer Design (ICCD 2012)
  • https://doi.org/10.1109/ICCD.2012.6378619

SafeMem: Exploiting ECC-Memory for Detecting Memory Leaks and Memory Corruption During Production Runs
conference, January 2005


MAGE: Adaptive Granularity and ECC for resilient and power efficient memory systems
conference, November 2012

  • Li, Sheng; Yoon, Doe Hyun; Chen, Ke
  • 2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
  • https://doi.org/10.1109/SC.2012.73