
Title: Hot-Spot Avoidance With Multi-Pathing Over Infiniband: An MPI Perspective

Abstract

Large scale InfiniBand clusters are becoming increasingly popular, as reflected by the TOP 500 Supercomputer rankings. At the same time, fat tree has become a popular interconnection topology for these clusters, since it allows multiple paths to be available in between a pair of nodes. However, even with fat tree, hot-spots may occur in the network depending upon the route configuration between end nodes and communication pattern(s) in the application. To make matters worse, the deterministic routing nature of InfiniBand limits the application from effective use of multiple paths transparently and avoid the hot-spots in the network. Simulation based studies for switches and adapters to implement congestion control have been proposed in the literature. However, these studies have focused on providing congestion control for the communication path, and not on utilizing multiple paths in the network for hot-spot avoidance. In this paper, we design an MPI functionality, which provides hot-spot avoidance for different communications, without a priori knowledge of the pattern. We leverage LMC (LID Mask Count) mechanism of InfiniBand to create multiple paths in the network and present the design issues (scheduling policies, selecting number of paths, scalability aspects) of our design. We implement our design and evaluate it with Pallas collective communication and MPI applications. On an InfiniBand cluster with 48 processes, collective operations like MPI All-to-all Personalized and MPI Reduce Scatter show an improvement of 27% and 19% respectively. Our evaluation with MPI applications like NAS Parallel Benchmarks and PSTSWM on 64 processes shows significant improvement in execution time with this functionality.
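
The LMC mechanism named in the abstract can be illustrated with a small sketch. With an LMC value of n, an InfiniBand port answers to 2^n consecutive LIDs starting at its base LID, and the subnet manager may route each of those LIDs along a different path. The Python below is only an illustration under stated assumptions: the names path_lids and RoundRobinPaths are hypothetical, and round-robin is an assumed stand-in policy, since the abstract does not say which scheduling policies the paper evaluates.

```python
from itertools import cycle


def path_lids(base_lid: int, lmc: int) -> list[int]:
    """Return the 2**lmc LIDs that address one port (hypothetical helper)."""
    if not 0 <= lmc <= 7:
        raise ValueError("LMC is a 3-bit field, so it must be in 0..7")
    count = 1 << lmc
    if base_lid % count != 0:
        raise ValueError("base LID must have its low `lmc` bits clear")
    return [base_lid + offset for offset in range(count)]


class RoundRobinPaths:
    """Assumed policy: hand out destination LIDs in a fixed rotation."""

    def __init__(self, base_lid: int, lmc: int) -> None:
        self._lids = cycle(path_lids(base_lid, lmc))

    def next_lid(self) -> int:
        # Each call selects the destination LID (and so the path) for one message.
        return next(self._lids)


scheduler = RoundRobinPaths(base_lid=16, lmc=2)
chosen = [scheduler.next_lid() for _ in range(6)]
print(chosen)  # [16, 17, 18, 19, 16, 17]
```

Spreading successive messages over the LIDs in this way is one simple means of using several paths without knowing the communication pattern in advance; the design described in the paper may differ.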

Authors:
Vishnu, A; Koop, M; Moody, A; Mamidala, A R; Narravula, S; Panda, D K
Publication Date:
2007-03-06
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
908380
Report Number(s):
UCRL-CONF-228725
TRN: US200722%%617
DOE Contract Number:
W-7405-ENG-48
Resource Type:
Conference
Resource Relation:
Conference: Presented at: CCGrid 07 - Seventh IEEE International Symposium on Cluster Computing and the Grid, Rio de Janeiro, Brazil, May 14 - May 17, 2007
Country of Publication:
United States
Language:
English
Subject:
99 GENERAL AND MISCELLANEOUS//MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE; AVOIDANCE; BENCHMARKS; COMMUNICATIONS; CONFIGURATION; DESIGN; EVALUATION; HOT SPOTS; ROUTING; SIMULATION; SUPERCOMPUTERS; SWITCHES; TOPOLOGY

Citation Formats

Vishnu, A, Koop, M, Moody, A, Mamidala, A R, Narravula, S, and Panda, D K. Hot-Spot Avoidance With Multi-Pathing Over Infiniband: An MPI Perspective. United States: N. p., 2007. Web.
Vishnu, A, Koop, M, Moody, A, Mamidala, A R, Narravula, S, & Panda, D K. Hot-Spot Avoidance With Multi-Pathing Over Infiniband: An MPI Perspective. United States.
Vishnu, A, Koop, M, Moody, A, Mamidala, A R, Narravula, S, and Panda, D K. 2007. "Hot-Spot Avoidance With Multi-Pathing Over Infiniband: An MPI Perspective". United States. https://www.osti.gov/servlets/purl/908380.
@article{osti_908380,
title = {Hot-Spot Avoidance With Multi-Pathing Over Infiniband: An MPI Perspective},
author = {Vishnu, A and Koop, M and Moody, A and Mamidala, A R and Narravula, S and Panda, D K},
abstractNote = {Large scale InfiniBand clusters are becoming increasingly popular, as reflected by the TOP 500 Supercomputer rankings. At the same time, fat tree has become a popular interconnection topology for these clusters, since it allows multiple paths to be available in between a pair of nodes. However, even with fat tree, hot-spots may occur in the network depending upon the route configuration between end nodes and communication pattern(s) in the application. To make matters worse, the deterministic routing nature of InfiniBand limits the application from effective use of multiple paths transparently and avoid the hot-spots in the network. Simulation based studies for switches and adapters to implement congestion control have been proposed in the literature. However, these studies have focused on providing congestion control for the communication path, and not on utilizing multiple paths in the network for hot-spot avoidance. In this paper, we design an MPI functionality, which provides hot-spot avoidance for different communications, without a priori knowledge of the pattern. We leverage LMC (LID Mask Count) mechanism of InfiniBand to create multiple paths in the network and present the design issues (scheduling policies, selecting number of paths, scalability aspects) of our design. We implement our design and evaluate it with Pallas collective communication and MPI applications. On an InfiniBand cluster with 48 processes, collective operations like MPI All-to-all Personalized and MPI Reduce Scatter show an improvement of 27% and 19% respectively. Our evaluation with MPI applications like NAS Parallel Benchmarks and PSTSWM on 64 processes shows significant improvement in execution time with this functionality.},
url = {https://www.osti.gov/servlets/purl/908380},
place = {United States},
year = {2007},
month = {3}
}
