A dynamic, unified design for dedicated message matching engines for collective and point-to-point communications
Abstract
The Message Passing Interface (MPI) libraries use message queues to guarantee correct message ordering between communicating processes. Message queues are in the critical path of MPI communications and thus, the performance of message queue operations can have significant impact on the performance of applications. Collective communications are widely used in MPI applications and they can have considerable impact on generating long message queues. In this paper, we propose a unified message matching mechanism that improves the message queue search time by distinguishing messages coming from point-to-point and collective communications and using a distinct message queue data structure for them. For collective operations, it dynamically profiles the impact of each collective call on message queues during the application runtime and uses this information to adapt the message queue data structure for each collective dynamically. Moreover, we use a partner/non-partner message queue data structure for the messages coming from point-to-point communications. The proposed approach can successfully reduce the queue search time while maintaining scalable memory consumption. The evaluation results show that we can obtain up to 5.5x runtime speedup for applications with long list traversals. Moreover, we can gain up to 15% and 94% queue search time improvement for all elements in applications with short and medium list traversals, respectively.
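To make the idea of separate queues concrete, here is a minimal sketch of why splitting the queue by traffic type can shorten searches. It is an illustration only, not the authors' design: the names `PostedRecv`, `SplitQueues`, and the `"coll"`/`"p2p"` labels are hypothetical, the queues are plain lists searched linearly, and the sketch assumes collective and point-to-point messages never need to match each other. The dynamic profiling and the partner/non-partner structure described in the abstract are not modeled.

```python
from dataclasses import dataclass

# Stand-ins for MPI_ANY_SOURCE / MPI_ANY_TAG (values chosen for this sketch only).
ANY_SOURCE = -1
ANY_TAG = -1


@dataclass
class PostedRecv:
    """A posted receive, identified by (source, tag, communicator)."""
    source: int
    tag: int
    comm: int


def matches(recv, source, tag, comm):
    """True if an incoming (source, tag, comm) message satisfies the receive."""
    return (recv.comm == comm
            and recv.source in (ANY_SOURCE, source)
            and recv.tag in (ANY_TAG, tag))


class SplitQueues:
    """Hypothetical: one FIFO list per traffic type instead of a single list."""

    def __init__(self):
        self.queues = {"coll": [], "p2p": []}

    def post(self, kind, recv):
        # Appending preserves posting order within each traffic type.
        self.queues[kind].append(recv)

    def match(self, kind, source, tag, comm):
        """Return (matched receive or None, number of entries inspected)."""
        queue = self.queues[kind]
        for index, recv in enumerate(queue):
            if matches(recv, source, tag, comm):
                queue.pop(index)  # first match in posting order wins
                return recv, index + 1
        return None, len(queue)


if __name__ == "__main__":
    q = SplitQueues()
    for src in range(3):
        q.post("coll", PostedRecv(src, 0, 0))
    q.post("p2p", PostedRecv(ANY_SOURCE, 7, 0))
    # The point-to-point search skips the three collective entries entirely.
    print(q.match("p2p", 5, 7, 0))
```

In a single shared list, the point-to-point message in the usage example would be compared against the three collective entries before reaching its match; with the split, only the point-to-point list is traversed.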
- Authors:
- Ghazimirsaeed, S. Mahdieh; Grant, Ryan E.; Afsahi, Ahmad
- Queen's Univ., Kingston, ON (Canada)
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Publication Date:
- Thu Sep 05 00:00:00 EDT 2019
- Research Org.:
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
- Sponsoring Org.:
- USDOE National Nuclear Security Administration (NNSA)
- OSTI Identifier:
- 1570290
- Alternate Identifier(s):
- OSTI ID: 1702563
- Report Number(s):
- SAND-2019-10114J; Journal ID: ISSN 0167-8191; 678886
- Grant/Contract Number:
- AC04-94AL85000; NA0003525
- Resource Type:
- Accepted Manuscript
- Journal Name:
- Parallel Computing
- Additional Journal Information:
- Journal Volume: 89; Journal Issue: C; Journal ID: ISSN 0167-8191
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
- Subject:
- 97 MATHEMATICS AND COMPUTING; MPI; Message matching; Message queue; Collective communications; Point-to-point communications
Citation Formats
Ghazimirsaeed, S. Mahdieh, Grant, Ryan E., and Afsahi, Ahmad. A dynamic, unified design for dedicated message matching engines for collective and point-to-point communications. United States: N. p., 2019. Web. doi:10.1016/j.parco.2019.102547.
Ghazimirsaeed, S. Mahdieh, Grant, Ryan E., & Afsahi, Ahmad. A dynamic, unified design for dedicated message matching engines for collective and point-to-point communications. United States. https://doi.org/10.1016/j.parco.2019.102547
Ghazimirsaeed, S. Mahdieh, Grant, Ryan E., and Afsahi, Ahmad. 2019. "A dynamic, unified design for dedicated message matching engines for collective and point-to-point communications". United States. https://doi.org/10.1016/j.parco.2019.102547. https://www.osti.gov/servlets/purl/1570290.
@article{osti_1570290,
title = {A dynamic, unified design for dedicated message matching engines for collective and point-to-point communications},
author = {Ghazimirsaeed, S. Mahdieh and Grant, Ryan E. and Afsahi, Ahmad},
abstractNote = {The Message Passing Interface (MPI) libraries use message queues to guarantee correct message ordering between communicating processes. Message queues are in the critical path of MPI communications and thus, the performance of message queue operations can have significant impact on the performance of applications. Collective communications are widely used in MPI applications and they can have considerable impact on generating long message queues. In this paper, we propose a unified message matching mechanism that improves the message queue search time by distinguishing messages coming from point-to-point and collective communications and using a distinct message queue data structure for them. For collective operations, it dynamically profiles the impact of each collective call on message queues during the application runtime and uses this information to adapt the message queue data structure for each collective dynamically. Moreover, we use a partner/non-partner message queue data structure for the messages coming from point-to-point communications. The proposed approach can successfully reduce the queue search time while maintaining scalable memory consumption. The evaluation results show that we can obtain up to 5.5x runtime speedup for applications with long list traversals. Moreover, we can gain up to 15% and 94% queue search time improvement for all elements in applications with short and medium list traversals, respectively.},
doi = {10.1016/j.parco.2019.102547},
journal = {Parallel Computing},
number = {C},
volume = 89,
place = {United States},
year = {2019},
month = sep
}