
Title: Failure detection in high-performance clusters and computers using chaotic map computations

Abstract

A programmable media includes a processing unit capable of independent operation in a machine that is capable of executing 10^18 floating point operations per second. The processing unit is in communication with a memory element and an interconnect that couples computing nodes. The programmable media includes a logical unit configured to execute arithmetic functions, comparative functions, and/or logical functions. The processing unit is configured to detect computing component failures, memory element failures and/or interconnect failures by executing programming threads that generate one or more chaotic map trajectories. The central processing unit or graphical processing unit is configured to detect a computing component failure, memory element failure and/or an interconnect failure through an automated comparison of signal trajectories generated by the chaotic maps.
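The comparison idea in the abstract can be illustrated with a minimal sketch. It is not the patented method: the choice of the logistic map, the parameter r = 4.0, the tolerance, and the injected error are all assumptions made for illustration, and the names `logistic_step`, `trajectory`, and `first_divergence` are hypothetical. Because a chaotic map amplifies small differences, a tiny arithmetic or memory error on one unit should quickly show up as a mismatch against a trajectory computed elsewhere from the same starting value.

```python
def logistic_step(x, r=4.0):
    """One iteration of the logistic map (an assumed choice of chaotic map)."""
    return r * x * (1.0 - x)


def trajectory(x0, steps, fault_at=None, fault_size=1e-12):
    """Iterate the map from x0 and return the list of states.

    fault_at and fault_size are hypothetical: they simulate a single small
    error on one computing unit by perturbing the state at one step.
    """
    states = []
    x = x0
    for i in range(steps):
        x = logistic_step(x)
        if fault_at is not None and i == fault_at:
            x += fault_size  # simulated component error (assumption)
        states.append(x)
    return states


def first_divergence(a, b, tol=1e-6):
    """Return the first index where two trajectories differ by more than tol,
    or None if they agree everywhere within tol."""
    for i, (u, v) in enumerate(zip(a, b)):
        if abs(u - v) > tol:
            return i
    return None


if __name__ == "__main__":
    reference = trajectory(0.3, 200)             # healthy unit
    replica = trajectory(0.3, 200)               # second healthy unit
    faulty = trajectory(0.3, 200, fault_at=10)   # unit with one injected error

    print("healthy vs healthy:", first_divergence(reference, replica))
    print("healthy vs faulty: ", first_divergence(reference, faulty))
```

Two healthy runs are bit-identical, so the comparison reports no divergence, while the run with the injected error departs from the reference a short time after the fault. A real deployment would compare trajectories computed on different nodes and exchanged over the interconnect, which this single-process sketch does not model.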

Inventors:
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1213445
Patent Number(s):
9,122,603
Application Number:
13/919,601
Assignee:
UT-Battelle, LLC (Oak Ridge, TN)
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Patent
Resource Relation:
Patent File Date: 2013 Jun 17
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING

Citation Formats

Rao, Nageswara S. Failure detection in high-performance clusters and computers using chaotic map computations. United States: N. p., 2015. Web.
Rao, Nageswara S. Failure detection in high-performance clusters and computers using chaotic map computations. United States.
Rao, Nageswara S. 2015. "Failure detection in high-performance clusters and computers using chaotic map computations". United States. https://www.osti.gov/servlets/purl/1213445.
@article{osti_1213445,
title = {Failure detection in high-performance clusters and computers using chaotic map computations},
author = {Rao, Nageswara S.},
abstractNote = {A programmable media includes a processing unit capable of independent operation in a machine that is capable of executing 10^18 floating point operations per second. The processing unit is in communication with a memory element and an interconnect that couples computing nodes. The programmable media includes a logical unit configured to execute arithmetic functions, comparative functions, and/or logical functions. The processing unit is configured to detect computing component failures, memory element failures and/or interconnect failures by executing programming threads that generate one or more chaotic map trajectories. The central processing unit or graphical processing unit is configured to detect a computing component failure, memory element failure and/or an interconnect failure through an automated comparison of signal trajectories generated by the chaotic maps.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2015},
month = {9}
}

Works referenced in this record:

Chaos: An Introduction to Dynamical Systems
journal, November 1997

  • Alligood, Kathleen T.; Sauer, Tim D.; Yorke, James A.
  • Physics Today, Vol. 50, Issue 11
  • DOI: 10.1063/1.882006

Basic concepts and taxonomy of dependable and secure computing
journal, January 2004

  • Avizienis, A.; Laprie, J. -C.; Randell, B.
  • IEEE Transactions on Dependable and Secure Computing, Vol. 1, Issue 1
  • DOI: 10.1109/TDSC.2004.2

Designing programs that check their work
journal, January 1995


Toward Exascale Resilience
journal, September 2009

  • Cappello, Franck; Geist, Al; Gropp, Bill
  • The International Journal of High Performance Computing Applications, Vol. 23, Issue 4
  • DOI: 10.1177/1094342009347767

The International Exascale Software Project roadmap
journal, January 2011

  • Dongarra, Jack; Beckman, Pete; Moore, Terry
  • The International Journal of High Performance Computing Applications, Vol. 25, Issue 1
  • DOI: 10.1177/1094342010391989

Quasiperiodic Route to Chaotic Dynamics of Internet Transport Protocols
journal, May 2005


Chaos: A tutorial for engineers
journal, January 1987


Computational complexity issues in operative diagnosis of graph-based systems
journal, April 1993

  • Rao, N. S. V.
  • IEEE Transactions on Computers, Vol. 42, Issue 4
  • DOI: 10.1109/12.214691

On polynomial-time testable combinational circuits
journal, January 1994

  • Rao, N. S. V.; Toida, S.
  • IEEE Transactions on Computers, Vol. 43, Issue 11
  • DOI: 10.1109/12.324562

Fail-stop processors: an approach to designing fault-tolerant computing systems
journal, August 1983

  • Schlichting, Richard D.; Schneider, Fred B.
  • ACM Transactions on Computer Systems, Vol. 1, Issue 3
  • DOI: 10.1145/357369.357371