Detection of shared memory faults in a computing job
Abstract
Technology for determining whether an inter-process message has been successfully sent from a first process to a second process, where both processes run on a single computer with a single set of one or more processors. A variable (for example, a bit value) indicates whether the inter-process message has been communicated between the processes. A timer and a predetermined timeout threshold are used to determine whether the inter-process message has been pending for too long without being successfully communicated.
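The mechanism the abstract describes (an indicator variable plus a timer and a timeout threshold) can be illustrated with a minimal sketch. This is not the patented implementation: the names `MessageSlot`, `post`, `receive`, and `is_faulted` are hypothetical, the "shared memory" is an ordinary in-process object, and the clock value is passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MessageSlot:
    """Hypothetical stand-in for a shared-memory message slot."""
    payload: Optional[bytes] = None
    pending: bool = False   # indicator variable (a single bit would suffice)
    sent_at: float = 0.0    # time at which the sender posted the message


def post(slot: MessageSlot, payload: bytes, now: float) -> None:
    """Sender side: write the message, record the time, then mark it pending."""
    slot.payload = payload
    slot.sent_at = now
    slot.pending = True


def receive(slot: MessageSlot) -> Optional[bytes]:
    """Receiver side: take the message and clear the indicator, if one is pending."""
    if not slot.pending:
        return None
    payload = slot.payload
    slot.pending = False
    return payload


def is_faulted(slot: MessageSlot, now: float, timeout_s: float) -> bool:
    """Report a fault if a message has been pending longer than the threshold."""
    return slot.pending and (now - slot.sent_at) > timeout_s
```

A caller in a real system would likely pass a monotonic clock reading (for example `time.monotonic()`) as `now` and choose `timeout_s` to suit the job; both choices are assumptions of this sketch rather than details given in the record.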
- Inventors:
- LePera, William P.; Sharkawi, Sameh Sherif; Lauria, Austen William
- Issue Date:
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1892528
- Patent Number(s):
- 11221906
- Application Number:
- 16/739,613
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- Patent Classifications (CPCs):
- G - PHYSICS; G06 - COMPUTING; G06F - ELECTRIC DIGITAL DATA PROCESSING
- DOE Contract Number:
- B604134
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 01/10/2020
- Country of Publication:
- United States
- Language:
- English
Citation Formats
LePera, William P., Sharkawi, Sameh Sherif, and Lauria, Austen William. Detection of shared memory faults in a computing job. United States: N. p., 2022. Web.
LePera, William P., Sharkawi, Sameh Sherif, & Lauria, Austen William. Detection of shared memory faults in a computing job. United States.
LePera, William P., Sharkawi, Sameh Sherif, and Lauria, Austen William. "Detection of shared memory faults in a computing job". United States. https://www.osti.gov/servlets/purl/1892528.
@article{osti_1892528,
title = {Detection of shared memory faults in a computing job},
author = {LePera, William P. and Sharkawi, Sameh Sherif and Lauria, Austen William},
abstractNote = {Technology for determining whether an inter-process type message has been successfully sent from a first process to a second process running on a single computer with a single processor(s) set. A variable (for example, a bit value) is used to indicate whether the inter-process message has been communicated between the processes. A timer and a predetermined timeout threshold are used to determine if the inter-process message has been pending for too long without being successfully communicated.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2022},
month = {1}
}
Works referenced in this record:
Cluster management system and method
patent-application, May 2013
- Hu, Liangjun; Quan, Rui
- US Patent Application 13/811371; 20130139178
Communication channel failover in a high performance computing (HPC) network
patent, May 2015
- Arroyo, Jesse P.; Bauman, Ellen M.; Schimke, Timothy J.
- US Patent Document 9,037,898
Remote service discovery and inter-process communication
patent-application, September 2019
- Chivetta, Anthony J.; Auricchio, Joseph R.; Pistol, Ion Valentin
- US Patent Application 16/352502; 20190286598
System and Method for Securely Connecting to a Peripheral Device
patent-application, August 2018
- Litichever, Gil; Gutentag, Oded; Zvuluny, Eyal
- US Patent Application 15/750528; 20180225230
System, method, and computer program product for improving memory systems
patent, August 2016
- Smith, Michael
- US Patent Document 9,432,298
Position parameterized recursive network architecture with topological addressing
patent, November 2018
- Day, John D.
- US Patent Document 10,135,689
System and Method for Improving Internet Communication by Using Intermediate Nodes
patent-application, March 2015
- Shribman, Derry; Vilenski, Ofer
- US Patent Application 14/468836; 20150067819
Automated system-level failure and recovery
patent, December 2018
- Brown, Michael Emery; Ballard, Lee E.; Cohoon, Stephen M.
- US Patent Document 10,146,653
Compressed headers for encapsulated real-time communications
patent-application, September 2016
- Herrero, Rolando; Katz, Henry
- US Patent Application 14/637550; 20160261558