Summary: Static Allocation of Multirail Networks
Salvador Coll, Eitan Frachtenberg, Fabrizio Petrini,
Adolfy Hoisie and Leonid Gurvits
CCS-3 Modeling, Algorithms, & Informatics Group
Computer & Computational Sciences Division
Los Alamos National Laboratory
{scoll,eitanf,fabrizio,hoisie,gurvits}@lanl.gov
Technical Report
Abstract
Using multiple independent networks (also known as rails) is an emerging technique
to overcome bandwidth limitations and enhance fault-tolerance of current
high-performance clusters. This report presents the limitations and performance
of static rail-allocation approaches, where each rail is pre-assigned a direction for
communication. An analytical lower bound on the number of networks required
for rail allocation is shown. We present an extensive experimental comparison of
various allocation schemes against static rail allocation in terms of bandwidth and
latency. We also compare the ability of static and dynamic rail-allocation
mechanisms to stripe messages over multiple rails, as well as their scalability.
We find that not only static