
Title: Optimizing Process-to-Core Mappings for Application Level Multi-dimensional MPI Communications

Abstract

Multi-dimensional MPI communications, in which MPI communications are performed along each dimension of a Cartesian communicator, are frequently used in today's high performance computing applications. While individual MPI collective communications for regular communicators with a one-dimensional linear ranking of processes have been extensively studied and optimized, little optimization has been performed for multi-dimensional MPI collective communications on multi-dimensional Cartesian topologies. In this paper, we optimize multi-dimensional MPI collective communications for SMP and multi-core systems at the application level. We show that the default Cartesian topology built by state-of-the-art MPI implementations produces sub-optimal performance for multi-dimensional MPI collective communications. We design optimal process-to-core mapping schemes for Cartesian communicators that minimize the total inter-node communication. The proposed technique improves performance by up to 80% over the default Cartesian topology built by Cray's MPI implementation MPT 3.1.02 on Jaguar at Oak Ridge National Laboratory, currently the world's second-fastest supercomputer.
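
The effect described in the abstract can be illustrated with a toy model. The sketch below is not the paper's mapping scheme: it assumes a hypothetical 8 x 8 process grid on nodes with 4 cores each, assumes the default placement packs consecutive row-major ranks onto the same node, and uses "number of distinct nodes spanned by each row or column communicator" as an assumed proxy for inter-node communication. The function names are illustrative only.

```python
def default_node(i, j, cols, cores):
    """Assumed default: row-major rank order, consecutive ranks share a node."""
    return (i * cols + j) // cores


def tiled_node(i, j, cols, tile_r, tile_c):
    """Alternative: each tile_r x tile_c block of the grid shares one node."""
    tiles_per_row = cols // tile_c
    return (i // tile_r) * tiles_per_row + (j // tile_c)


def nodes_spanned(node_of, rows, cols):
    """Sum, over all row and all column communicators, of distinct nodes touched."""
    row_total = sum(len({node_of(i, j) for j in range(cols)}) for i in range(rows))
    col_total = sum(len({node_of(i, j) for i in range(rows)}) for j in range(cols))
    return row_total, col_total


ROWS, COLS, CORES = 8, 8, 4  # hypothetical grid and node size

default = nodes_spanned(lambda i, j: default_node(i, j, COLS, CORES), ROWS, COLS)
tiled = nodes_spanned(lambda i, j: tiled_node(i, j, COLS, 2, 2), ROWS, COLS)

print("default (rows, cols):", default)  # (16, 64)
print("2x2 tiles (rows, cols):", tiled)  # (32, 32)
```

In this toy setting the rank-order placement keeps each row communicator on few nodes but spreads every column communicator across 8 nodes, while the 2 x 2 tiling balances the two dimensions and lowers the combined total from 80 to 64. Which placement is actually best depends on the collectives used and the message sizes in each dimension.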

Authors:
Karlsson, Christer; Davies, Teresa; Chen, Zizhong
Publication Date:
Research Org.:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1567309
Resource Type:
Conference
Journal Name:
2012 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER)
Additional Journal Information:
Conference: 2012 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING, September 24-28, 2012, Beijing, China
Country of Publication:
United States
Language:
English
Subject:
Computer Science

Citation Formats

Karlsson, Christer, Davies, Teresa, and Chen, Zizhong. Optimizing Process-to-Core Mappings for Application Level Multi-dimensional MPI Communications. United States: N. p., 2012. Web. doi:10.1109/CLUSTER.2012.47.
Karlsson, Christer, Davies, Teresa, & Chen, Zizhong. Optimizing Process-to-Core Mappings for Application Level Multi-dimensional MPI Communications. United States. doi:10.1109/CLUSTER.2012.47.
Karlsson, Christer, Davies, Teresa, and Chen, Zizhong. 2012. "Optimizing Process-to-Core Mappings for Application Level Multi-dimensional MPI Communications". United States. doi:10.1109/CLUSTER.2012.47.
@article{osti_1567309,
title = {Optimizing Process-to-Core Mappings for Application Level Multi-dimensional MPI Communications},
author = {Karlsson, Christer and Davies, Teresa and Chen, Zizhong},
doi = {10.1109/CLUSTER.2012.47},
journal = {2012 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER)},
place = {United States},
year = {2012},
month = {10}
}
