DOE PAGES: U.S. Department of Energy, Office of Scientific and Technical Information

This content will become publicly available on February 6, 2020

Title: Tail queues: A multi-threaded matching architecture

Abstract

As we approach exascale, computational parallelism will have to increase drastically in order to meet throughput targets. Many-core architectures have exacerbated this problem by trading clock speed, core complexity, and computational throughput for increased parallelism. This presents two major challenges for communication libraries such as MPI: the library must leverage the performance advantages of thread-level parallelism, and it must avoid the scalability problems associated with increasing the number of processes to that scale. Hybrid programming models, such as MPI+X, have been proposed to address these challenges. MPI_THREAD_MULTIPLE is MPI's thread-safe mode. Although there has been work to optimize it, it still performs poorly in most implementations. Current applications avoid MPI multithreading because of these performance concerns, but future applications are expected to use it. One of the major synchronous data structures required by MPI is the matching engine. In this paper, we present a parallel matching algorithm that can improve MPI matching for multithreaded applications. We then perform a feasibility study to demonstrate the performance benefit of the technique.
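For readers unfamiliar with the matching engine mentioned in the abstract, the following is a minimal sketch of conventional MPI-style message matching. It is not the parallel algorithm presented in the paper. The class `MatchingEngine`, its methods, and the constants `ANY_SOURCE` and `ANY_TAG` are hypothetical names chosen for illustration. The sketch assumes the usual two-queue design (a posted receive queue and an unexpected message queue) guarded by a single lock, which illustrates why matching can serialize threads in a thread-safe mode.

```python
import threading

# Hypothetical stand-ins for MPI_ANY_SOURCE / MPI_ANY_TAG (illustration only).
ANY_SOURCE = -1
ANY_TAG = -1


def matches(recv, msg):
    """A receive matches a message when the communicators are equal and the
    source and tag are either equal or wildcarded on the receive side."""
    r_src, r_tag, r_comm = recv
    m_src, m_tag, m_comm = msg
    return (r_comm == m_comm
            and r_src in (ANY_SOURCE, m_src)
            and r_tag in (ANY_TAG, m_tag))


class MatchingEngine:
    """Two ordered queues behind one lock (an assumed baseline design)."""

    def __init__(self):
        self.posted = []      # posted receives, oldest first
        self.unexpected = []  # messages that arrived before a matching receive
        self.lock = threading.Lock()

    def post_recv(self, source, tag, comm):
        """Post a receive; return the oldest matching unexpected message,
        or None if the receive had to be queued."""
        recv = (source, tag, comm)
        with self.lock:
            for i, msg in enumerate(self.unexpected):
                if matches(recv, msg):
                    return self.unexpected.pop(i)
            self.posted.append(recv)
            return None

    def arrive(self, source, tag, comm):
        """Handle an incoming message; return the oldest matching posted
        receive, or None if the message had to be queued as unexpected."""
        msg = (source, tag, comm)
        with self.lock:
            for i, recv in enumerate(self.posted):
                if matches(recv, msg):
                    return self.posted.pop(i)
            self.unexpected.append(msg)
            return None
```

In this sketch every call to `post_recv` or `arrive` takes the same lock and scans a queue in order, so concurrent threads wait on one another. That serialization is the kind of cost a parallel matching algorithm would aim to reduce.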

Authors:
Dosanjh, Matthew G. F. [1]; Grant, Ryan E. [1]; Schonbein, Whit [1]; Bridges, Patrick G. [2]
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Univ. of New Mexico, Albuquerque, NM (United States)
  2. Univ. of New Mexico, Albuquerque, NM (United States)
Publication Date:
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1496973
Report Number(s):
SAND-2019-1466J
Journal ID: ISSN 1532-0626; 672473
Grant/Contract Number:  
AC04-94AL85000
Resource Type:
Accepted Manuscript
Journal Name:
Concurrency and Computation: Practice and Experience
Publisher:
Wiley
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; high performance computing; many core; MPI; networks

Citation Formats

Dosanjh, Matthew G. F., Grant, Ryan E., Schonbein, Whit, and Bridges, Patrick G. Tail queues: A multi-threaded matching architecture. United States: N. p., 2019. Web. doi:10.1002/cpe.5158.
