Reliability processing of remote direct memory access
Abstract
Methods and systems for monitoring remote transmissions of messages among a plurality of nodes are described. A processing element in a first node may allocate a sequence number to a request to read and/or update data in a second node. The processing element may be different from main processors of the first node. The processing element may send the message and the sequence number to the second node. The processing element may modify a status of the sequence number to an active state, indicating a transmission of the message is pending. The processing element may, in response to a response from the second node, modify the status of the sequence number to an inactive state, indicating a completed transmission of the message. The processing element may, in response to no response from the second node within a time period, resend the message and the sequence number to the second node.
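The bookkeeping described in the abstract can be illustrated with a minimal Python sketch. This is an assumption-laden illustration, not the patented implementation: the names `SequenceTracker`, `submit`, `on_response`, and `resend_expired` are hypothetical, as are the injected `send` callback and `clock` function, and the abstract does not specify how sequence numbers are generated or how the time period is measured.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Dict, List

ACTIVE = "active"      # transmission of the message is pending
INACTIVE = "inactive"  # transmission of the message has completed


@dataclass
class Entry:
    message: bytes
    status: str
    sent_at: float


class SequenceTracker:
    """Hypothetical tracker for outstanding remote requests (illustration only)."""

    def __init__(self, send: Callable[[int, bytes], None], timeout: float,
                 clock: Callable[[], float]):
        self._send = send        # assumed transport hook to the second node
        self._timeout = timeout  # assumed time period before a resend
        self._clock = clock      # assumed time source, injectable for testing
        self._next_seq = itertools.count()
        self.entries: Dict[int, Entry] = {}

    def submit(self, message: bytes) -> int:
        """Allocate a sequence number, send the message, and mark it active."""
        seq = next(self._next_seq)
        self._send(seq, message)
        self.entries[seq] = Entry(message, ACTIVE, self._clock())
        return seq

    def on_response(self, seq: int) -> None:
        """Mark the sequence number inactive once the second node responds."""
        entry = self.entries.get(seq)
        if entry is not None:
            entry.status = INACTIVE

    def resend_expired(self) -> List[int]:
        """Resend every active message whose time period has elapsed."""
        now = self._clock()
        resent = []
        for seq, entry in self.entries.items():
            if entry.status == ACTIVE and now - entry.sent_at >= self._timeout:
                self._send(seq, entry.message)
                entry.sent_at = now  # restart the timer for this sequence number
                resent.append(seq)
        return resent
```

In this sketch the tracker is a plain object driven by its caller; in the patent the equivalent logic runs on a processing element separate from the node's main processors, which the example does not attempt to model.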
- Inventors:
- Kumar, Sameer; Heidelberger, Philip; Sugawara, Yutaka; Chen, Dong; Senger, Robert M.
- Issue Date:
- Research Org.:
- International Business Machines Corp., Armonk, NY (United States)
- Sponsoring Org.:
- USDOE
- OSTI Identifier:
- 1805664
- Patent Number(s):
- 10958588
- Application Number:
- 15/888,228
- Assignee:
- International Business Machines Corporation (Armonk, NY)
- DOE Contract Number:
- B554331
- Resource Type:
- Patent
- Resource Relation:
- Patent File Date: 02/05/2018
- Country of Publication:
- United States
- Language:
- English
Citation Formats
Kumar, Sameer, Heidelberger, Philip, Sugawara, Yutaka, Chen, Dong, and Senger, Robert M. Reliability processing of remote direct memory access. United States: N. p., 2021. Web.
Kumar, Sameer, Heidelberger, Philip, Sugawara, Yutaka, Chen, Dong, & Senger, Robert M. Reliability processing of remote direct memory access. United States.
Kumar, Sameer, Heidelberger, Philip, Sugawara, Yutaka, Chen, Dong, and Senger, Robert M. 2021. "Reliability processing of remote direct memory access". United States. https://www.osti.gov/servlets/purl/1805664.
@article{osti_1805664,
title = {Reliability processing of remote direct memory access},
author = {Kumar, Sameer and Heidelberger, Philip and Sugawara, Yutaka and Chen, Dong and Senger, Robert M.},
abstractNote = {Methods and systems for monitoring remote transmissions of messages among a plurality of nodes are described. A processing element in a first node may allocate a sequence number to a request to read and/or update data in a second node. The processing element may be different from main processors of the first node. The processing element may send the message and the sequence number to the second node. The processing element may modify a status of the sequence number to an active state, indicating a transmission of the message is pending. The processing element may, in response to a response from the second node, modify the status of the sequence number to an inactive state, indicating a completed transmission of the message. The processing element may, in response to no response from the second node within a time period, resend the message and the sequence number to the second node.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = {2021},
month = {3}
}