Title: Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers

Conference

On large supercomputers, the job scheduling system may assign a non-contiguous node allocation to a user application, depending on available resources. For parallel applications using MPI (Message Passing Interface), the default process ordering does not take into account the physical layout of the nodes allocated to the application. This contributes to non-locality in the physical network topology and degrades the application's communication performance. To mitigate such performance penalties, this work describes techniques to identify a suitable task mapping that takes into account both the layout of the allocated nodes and the application's communication behavior. During the first phase of this research, we instrumented critical US DOE (United States Department of Energy) applications and collected performance data to characterize their communication behavior using an augmented version of the mpiP tool. Subsequently, we developed several reordering methods (spectral bisection, neighbor-join tree, etc.) that combine node layout and application communication data for optimized task placement. We developed a tool called mpiAproxy to facilitate detailed evaluation of the various reordering algorithms without requiring full application executions. This work presents a comprehensive performance evaluation (14,000 experiments) of the various task mapping techniques in lowering communication costs on Titan, the leadership-class supercomputer at Oak Ridge National Laboratory.
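
As a rough illustration of why task mapping matters, the sketch below scores a rank-to-node placement by summing bytes exchanged times network hops, then searches for a cheaper placement. It is not the paper's method: the torus-shaped network, the bytes-times-hops cost metric, the toy communication matrix, and the function names (hop_distance, mapping_cost, best_mapping) are all assumptions made for this example, and the exhaustive search merely stands in for the reordering algorithms named in the abstract.

```python
from itertools import permutations


def hop_distance(a, b, dims):
    """Hop count between coordinates a and b on a torus (assumed topology)."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def mapping_cost(comm, placement, nodes, dims):
    """Assumed cost metric: bytes exchanged times hops, summed over rank pairs.

    placement[r] is the index into `nodes` that hosts rank r.
    """
    n = len(comm)
    return sum(
        comm[i][j] * hop_distance(nodes[placement[i]], nodes[placement[j]], dims)
        for i in range(n)
        for j in range(i + 1, n)
    )


def best_mapping(comm, nodes, dims):
    """Exhaustive search over placements; only feasible for tiny examples."""
    return min(
        permutations(range(len(comm))),
        key=lambda p: mapping_cost(comm, p, nodes, dims),
    )


# Hypothetical non-contiguous allocation on a ring of 8 nodes, in scheduler order.
dims = (8, 1, 1)
nodes = [(0, 0, 0), (4, 0, 0), (1, 0, 0), (5, 0, 0)]

# Hypothetical symmetric communication volumes between 4 MPI ranks.
comm = [
    [0, 100, 0, 0],
    [100, 0, 1, 0],
    [0, 1, 0, 100],
    [0, 0, 100, 0],
]

default_placement = (0, 1, 2, 3)  # rank r runs on the r-th allocated node
default_cost = mapping_cost(comm, default_placement, nodes, dims)
best = best_mapping(comm, nodes, dims)
best_cost = mapping_cost(comm, best, nodes, dims)

if __name__ == "__main__":
    print("default cost:", default_cost)
    print("best placement:", best, "cost:", best_cost)
```

In this toy case the default ordering places the two heavily communicating rank pairs four hops apart, while a reordered placement puts each pair on adjacent nodes. Exhaustive search scales factorially, which is why heuristics are needed at realistic job sizes.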

Research Organization: Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Organization: USDOE Office of Science (SC)
OSTI ID: 1567437
Resource Relation: Conference: ICPE '16 Proceedings of the 7th ACM/SPEC on International Conference on Performance Engineering
Country of Publication: United States
Language: English
