U.S. Department of Energy
Office of Scientific and Technical Information

Topology Agnostic Hot-Spot Avoidance with InfiniBand

Journal Article · Concurrency and Computation. Practice & Experience, 21(3):301-319
DOI: https://doi.org/10.1002/cpe.1359 · OSTI ID: 985587
InfiniBand has become a very popular interconnect, owing to its advanced features and open standard. Large-scale InfiniBand clusters are increasingly common, as reflected by the TOP500 supercomputer rankings. However, even with popular topologies such as the constant bisection bandwidth Fat Tree, hot-spots may occur with InfiniBand because of inappropriate configuration of network paths, the presence of other jobs in the network, and the unavailability of adaptive routing. In this paper, we present a hot-spot avoidance layer (HSAL) for InfiniBand, which avoids hot-spots by combining path bandwidth estimation with multi-pathing through the LID Mask Control (LMC) mechanism, without taking the network topology into account. We propose an adaptive striping policy with a batch-based striping and sorting approach for efficient utilization of disjoint network paths. Integration of HSAL with MPI, the de facto programming model for clusters, shows promising results with collective communication primitives and MPI applications.
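The abstract gives no implementation detail, so the following is only a minimal sketch of the general idea of striping a message across several paths in proportion to estimated bandwidth. It is not the HSAL algorithm from the paper. The function names `lids_for_port` and `stripe_sizes`, the integer bandwidth units, and the rule for handing out rounding leftovers are all assumptions made for illustration. The one InfiniBand fact relied on is that a port configured with an LMC value of k answers to 2^k consecutive LIDs, which is what makes multiple paths to the same destination addressable.

```python
def lids_for_port(base_lid, lmc):
    """Return the LIDs a port answers to for a given LMC value.

    With LMC = k, a port is addressable through 2**k consecutive LIDs
    starting at its base LID. Hypothetical helper for illustration.
    """
    return [base_lid + offset for offset in range(1 << lmc)]


def stripe_sizes(message_bytes, path_bandwidths):
    """Split a message across paths in proportion to estimated bandwidth.

    path_bandwidths holds one integer estimate per path (assumed units,
    e.g. MB/s). Returns the number of bytes to send on each path.
    """
    total = sum(path_bandwidths)
    if total <= 0:
        raise ValueError("at least one path needs a positive bandwidth estimate")

    # Proportional share per path, rounded down to whole bytes.
    sizes = [message_bytes * bw // total for bw in path_bandwidths]

    # Bytes lost to rounding go to the paths with the highest estimates.
    remainder = message_bytes - sum(sizes)
    fastest_first = sorted(
        range(len(sizes)), key=lambda i: path_bandwidths[i], reverse=True
    )
    for i in fastest_first[:remainder]:
        sizes[i] += 1
    return sizes


if __name__ == "__main__":
    paths = lids_for_port(base_lid=16, lmc=2)   # four addressable paths
    estimates = [300, 100, 300, 100]            # made-up bandwidth estimates
    for lid, size in zip(paths, stripe_sizes(1_000_000, estimates)):
        print(f"LID {lid}: {size} bytes")
```

In this sketch a congested path simply receives a smaller share of each message. How the bandwidth estimates are obtained and refreshed, and how stripes are batched and ordered, is left open here because the abstract does not specify it.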
Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
985587
Report Number(s):
PNNL-SA-69491; KJ0402000
Journal Information:
Journal Name: Concurrency and Computation. Practice & Experience; Journal Volume: 21; Journal Issue: 3; Pages: 301-319
Country of Publication:
United States
Language:
English