Title: Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers

Abstract

On large supercomputers, the job scheduling systems may assign a non-contiguous node allocation for user applications depending on available resources. With parallel applications using MPI (Message Passing Interface), the default process ordering does not take into account the actual physical node layout available to the application. This contributes to non-locality in terms of physical network topology and impacts communication performance of the application. In order to mitigate such performance penalties, this work describes techniques to identify suitable task mapping that takes the layout of the allocated nodes as well as the application's communication behavior into account. During the first phase of this research, we instrumented and collected performance data to characterize communication behavior of critical US DOE (United States - Department of Energy) applications using an augmented version of the mpiP tool. Subsequently, we developed several reordering methods (spectral bisection, neighbor join tree, etc.) to combine node layout and application communication data for optimized task placement. We developed a tool called mpiAproxy to facilitate detailed evaluation of the various reordering algorithms without requiring full application executions. This work presents a comprehensive performance evaluation (14,000 experiments) of the various task mapping techniques in lowering communication costs on Titan, the leadership-class supercomputer at Oak Ridge National Laboratory.
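
As a rough illustration of the kind of reordering the abstract mentions, the sketch below orders MPI ranks by the Fiedler vector of a communication graph (the idea behind spectral bisection) and then fills node slots in that order. It is not the authors' implementation: the function names, the toy communication matrix, the two-node layout with four cores each, and the "off-node volume" metric are all assumptions made for this example.

```python
import math


def fiedler_vector(weights, iters=500):
    """Approximate the Fiedler vector of the graph Laplacian L = D - W.

    weights is a symmetric n x n list of lists holding non-negative
    communication volumes between ranks (hypothetical input format).
    Power iteration on (shift*I - L), with the constant vector projected
    out, converges to the eigenvector of the smallest non-zero eigenvalue.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    shift = 2.0 * max(degree) + 1.0  # upper bound on the eigenvalues of L
    v = [float(i) - (n - 1) / 2.0 for i in range(n)]  # non-constant start
    for _ in range(iters):
        w = [
            (shift - degree[i]) * v[i]
            + sum(weights[i][j] * v[j] for j in range(n))
            for i in range(n)
        ]
        mean = sum(w) / n
        w = [x - mean for x in w]  # remove the constant component
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def spectral_order(weights):
    """Return ranks sorted by their Fiedler-vector entry."""
    v = fiedler_vector(weights)
    return sorted(range(len(weights)), key=lambda r: v[r])


def off_node_volume(weights, order, cores_per_node):
    """Sum the volume exchanged between ranks placed on different nodes.

    order[k] is the rank placed in slot k; slot k belongs to node
    k // cores_per_node (a simplified, assumed node layout).
    """
    node_of = {rank: slot // cores_per_node for slot, rank in enumerate(order)}
    n = len(weights)
    return sum(
        weights[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if node_of[i] != node_of[j]
    )


def toy_weights():
    """Eight ranks: even ranks talk heavily to each other, as do odd ranks."""
    n = 8
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and i % 2 == j % 2:
                w[i][j] = 10.0
    w[0][1] = w[1][0] = 1.0  # one weak link between the two groups
    return w


if __name__ == "__main__":
    w = toy_weights()
    default = list(range(8))
    reordered = spectral_order(w)
    print("default  :", off_node_volume(w, default, 4))
    print("reordered:", off_node_volume(w, reordered, 4))
```

On this toy input the default rank order splits both heavily communicating groups across the two nodes, while the spectral order keeps each group on one node, so only the weak link crosses nodes. A real mapping would also need the physical distances between allocated nodes, which this sketch ignores.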

Authors:
Sreepathi, Sarat [1]; D'Azevedo, Ed [1]; Philip, Bobby [1]; Worley, Patrick [1]
  1. Oak Ridge National Laboratory, Oak Ridge, TN, USA
Publication Date:
2016
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1567437
Resource Type:
Conference
Resource Relation:
Conference: ICPE '16 Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; Computer Science

Citation Formats

Sreepathi, Sarat, D'Azevedo, Ed, Philip, Bobby, and Worley, Patrick. Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers. United States: N. p., 2016. Web. doi:10.1145/2851553.2851575.
Sreepathi, Sarat, D'Azevedo, Ed, Philip, Bobby, & Worley, Patrick. Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers. United States. https://doi.org/10.1145/2851553.2851575
Sreepathi, Sarat, D'Azevedo, Ed, Philip, Bobby, and Worley, Patrick. 2016. "Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers". United States. https://doi.org/10.1145/2851553.2851575.
@article{osti_1567437,
title = {Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers},
author = {Sreepathi, Sarat and D'Azevedo, Ed and Philip, Bobby and Worley, Patrick},
abstractNote = {On large supercomputers, the job scheduling systems may assign a non-contiguous node allocation for user applications depending on available resources. With parallel applications using MPI (Message Passing Interface), the default process ordering does not take into account the actual physical node layout available to the application. This contributes to non-locality in terms of physical network topology and impacts communication performance of the application. In order to mitigate such performance penalties, this work describes techniques to identify suitable task mapping that takes the layout of the allocated nodes as well as the application's communication behavior into account. During the first phase of this research, we instrumented and collected performance data to characterize communication behavior of critical US DOE (United States - Department of Energy) applications using an augmented version of the mpiP tool. Subsequently, we developed several reordering methods (spectral bisection, neighbor join tree etc.) to combine node layout and application communication data for optimized task placement. We developed a tool called mpiAproxy to facilitate detailed evaluation of the various reordering algorithms without requiring full application executions. This work presents a comprehensive performance evaluation (14,000 experiments) of the various task mapping techniques in lowering communication costs on Titan, the leadership class supercomputer at Oak Ridge National Laboratory.},
doi = {10.1145/2851553.2851575},
url = {https://www.osti.gov/biblio/1567437},
place = {United States},
year = {2016},
month = {1}
}

