Error correcting code with chip kill capability and power saving enhancement
Abstract
A method and system are disclosed for detecting memory chip failure in a computer memory system. The method comprises accessing user data from a set of user data chips and testing the user data for errors using data from a set of system data chips. The testing generates a sequence of check symbols from the user data, groups the user data into a sequence of data symbols, and computes a specified sequence of syndromes. If all the syndromes are zero, the user data has no errors. If any syndrome is non-zero, a set of discriminator expressions is computed and used to determine whether a single or double symbol error has occurred. In the preferred embodiment, fewer than two full system data chips are used for testing and correcting the user data.
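The syndrome-and-discriminator flow described in the abstract can be illustrated with a small sketch. The record does not give the patent's actual code construction, so everything below is an assumption chosen for illustration: a textbook Reed-Solomon-style code over GF(2^8) with primitive polynomial 0x11D, three check symbols, and a single discriminator `s1*s1 + s0*s2`. The names `encode`, `syndromes`, and `check` are hypothetical.

```python
# Illustrative only: a generic single-correct / double-detect symbol code,
# NOT the code construction claimed in the patent.

NCHECK = 3  # number of check symbols (assumption)

# Log/antilog tables for GF(2^8); primitive polynomial 0x11D is an assumption.
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_div(a, b):
    """Divide a by a non-zero b."""
    if a == 0:
        return 0
    return EXP[(LOG[a] - LOG[b]) % 255]


def _times_root(p, r):
    """Multiply polynomial p (highest degree first) by (x + r)."""
    out = p + [0]
    for i, c in enumerate(p):
        out[i + 1] ^= gf_mul(c, r)
    return out


# Generator polynomial with roots alpha^0, alpha^1, alpha^2.
GEN = [1]
for _j in range(NCHECK):
    GEN = _times_root(GEN, EXP[_j])


def encode(data):
    """Append NCHECK check symbols: remainder of data(x) * x^NCHECK mod GEN."""
    rem = list(data) + [0] * NCHECK
    for i in range(len(data)):
        coef = rem[i]
        if coef:
            for j in range(1, len(GEN)):
                rem[i + j] ^= gf_mul(GEN[j], coef)
    return list(data) + rem[-NCHECK:]


def syndromes(word):
    """Evaluate the received word at alpha^0..alpha^(NCHECK-1) (Horner)."""
    out = []
    for j in range(NCHECK):
        s = 0
        for c in word:
            s = gf_mul(s, EXP[j]) ^ c
        out.append(s)
    return out


def check(word):
    """Classify the word and return (status, corrected word or None)."""
    s0, s1, s2 = syndromes(word)
    if s0 == 0 and s1 == 0 and s2 == 0:
        return "no error", list(word)
    # Discriminator: zero for a single symbol error, non-zero for a double.
    disc = gf_mul(s1, s1) ^ gf_mul(s0, s2)
    if disc == 0 and s0 != 0 and s1 != 0:
        power = LOG[gf_div(s1, s0)]      # error sits on the x^power term
        pos = len(word) - 1 - power
        if pos >= 0:
            fixed = list(word)
            fixed[pos] ^= s0             # s0 equals the error value
            return "single symbol error", fixed
    return "double (or more) symbol error", None


if __name__ == "__main__":
    word = encode(list(range(1, 17)))    # 16 hypothetical data symbols
    bad = list(word)
    bad[5] ^= 0x5A
    print(check(word)[0])
    print(check(bad)[0])
```

In this sketch a single corrupted symbol makes the discriminator vanish, so the error location follows from `s1 / s0` and the error value from `s0`; two corrupted symbols always leave the discriminator non-zero, so the word is flagged rather than miscorrected.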
- Inventors:
- Mount Kisco, NY
- Croton On Hudson, NY
- Yorktown Heights, NY
- Rochester, MN
- Brewster, NY
- Scarsdale, NY
- Issue Date:
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1023908
- Patent Number(s):
- 8010875
- Application Number:
- 11/768,559
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B554331
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 2007 Jun 26
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING
Citation Formats
Gara, Alan G, Chen, Dong, Coteus, Paul W, Flynn, William T, Marcella, James A, Takken, Todd, Trager, Barry M, and Winograd, Shmuel. Error correcting code with chip kill capability and power saving enhancement. United States: N. p., 2011. Web.
Gara, Alan G, Chen, Dong, Coteus, Paul W, Flynn, William T, Marcella, James A, Takken, Todd, Trager, Barry M, & Winograd, Shmuel. Error correcting code with chip kill capability and power saving enhancement. United States.
Gara, Alan G, Chen, Dong, Coteus, Paul W, Flynn, William T, Marcella, James A, Takken, Todd, Trager, Barry M, and Winograd, Shmuel. 2011. "Error correcting code with chip kill capability and power saving enhancement". United States. https://www.osti.gov/servlets/purl/1023908.
@article{osti_1023908,
title = {Error correcting code with chip kill capability and power saving enhancement},
author = {Gara, Alan G and Chen, Dong and Coteus, Paul W and Flynn, William T and Marcella, James A and Takken, Todd and Trager, Barry M and Winograd, Shmuel},
abstractNote = {A method and system are disclosed for detecting memory chip failure in a computer memory system. The method comprises accessing user data from a set of user data chips and testing the user data for errors using data from a set of system data chips. The testing generates a sequence of check symbols from the user data, groups the user data into a sequence of data symbols, and computes a specified sequence of syndromes. If all the syndromes are zero, the user data has no errors. If any syndrome is non-zero, a set of discriminator expressions is computed and used to determine whether a single or double symbol error has occurred. In the preferred embodiment, fewer than two full system data chips are used for testing and correcting the user data.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2011},
month = {8}
}