Accelerating list management for MPI.
The latency and throughput of MPI messages are critically important to a range of parallel scientific applications. In many modern networks, both of these performance characteristics are largely driven by the performance of a processor on the network interface. Because of the semantics of MPI, this embedded processor is forced to traverse a linked list of posted receives each time a message is received. As this list grows long, the latency of message reception grows and the throughput of MPI messages decreases. This paper presents a novel hardware feature to handle list management functions on a network interface. By moving functions such as list insertion, list traversal, and list deletion to the hardware unit, latencies are decreased by up to 20% in the zero length queue case with dramatic improvements in the presence of long queues. Similarly, the throughput is increased by up to 10% in the zero length queue case and by nearly 100% in the presence queues of 30 messages.
- Research Organization:
- Sandia National Laboratories
- Sponsoring Organization:
- USDOE
- DOE Contract Number:
- AC04-94AL85000
- OSTI ID:
- 967795
- Report Number(s):
- SAND2005-4564C
- Country of Publication:
- United States
- Language:
- English
Similar Records
Evaluating MPI resource usage summary statistics
Using Simulation to Examine the Effect of MPI Message Matching Costs on Application Performance