U.S. Department of Energy
Office of Scientific and Technical Information

USING MULTIRAIL NETWORKS IN HIGH-PERFORMANCE CLUSTERS

Conference

Using multiple independent networks (also known as rails) is an emerging technique to overcome bandwidth limitations and enhance the fault tolerance of current high-performance clusters. We present an extensive experimental comparison of the behavior of various rail allocation schemes in terms of bandwidth and latency. We show that striping messages over multiple rails can substantially reduce network latency, depending on average message size, network load, and allocation scheme. The compared methods include a basic round-robin rail allocation, a local-dynamic allocation based on local knowledge, and a dynamic rail allocation that reserves both communication endpoints of a message before sending it. The last method is shown to perform better than the others at higher loads: up to 49% better than the local-dynamic allocation and 37% better than the round-robin allocation. This allocation scheme also shows lower latency and, for sufficiently large messages, saturates at higher loads. Most importantly, the proposed allocation scheme scales well with the number of rails and with message size. In addition, we propose a hybrid algorithm that combines the benefits of the local-dynamic allocation for short messages with those of the dynamic algorithm for large messages.

Keywords: Communication Protocols, High-Performance Interconnection Networks, Performance Evaluation, Routing, Communication Libraries, Parallel Architectures.
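The allocation schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the algorithms, so the function names, the per-rail "free at" timestamps, and the size threshold in the hybrid rule are all assumptions made here for illustration. In particular, the endpoint reservation step of the dynamic scheme is approximated by picking the rail on which both endpoints become free earliest.

```python
from itertools import cycle


def round_robin(num_rails):
    """Basic round-robin: cycle through rails regardless of their state."""
    return cycle(range(num_rails))


def pick_local_dynamic(src_free_at):
    """Local-dynamic (assumed form): use only sender-side knowledge and
    pick the rail whose local port becomes free earliest."""
    return min(range(len(src_free_at)), key=lambda r: src_free_at[r])


def pick_dynamic(src_free_at, dst_free_at):
    """Dynamic (assumed form): consider both endpoints and pick the rail
    on which sender and receiver are both free earliest."""
    return min(
        range(len(src_free_at)),
        key=lambda r: max(src_free_at[r], dst_free_at[r]),
    )


def pick_hybrid(size, threshold, src_free_at, dst_free_at):
    """Hybrid (assumed form): local-dynamic for short messages,
    dynamic for large ones. The threshold is a hypothetical parameter."""
    if size < threshold:
        return pick_local_dynamic(src_free_at)
    return pick_dynamic(src_free_at, dst_free_at)


if __name__ == "__main__":
    # Hypothetical times at which each of three rails becomes free.
    src = [5, 2, 9]   # sender side
    dst = [1, 8, 3]   # receiver side

    rr = round_robin(3)
    print([next(rr) for _ in range(5)])
    print(pick_local_dynamic(src))
    print(pick_dynamic(src, dst))
    print(pick_hybrid(64, 1024, src, dst))
    print(pick_hybrid(4096, 1024, src, dst))
```

In this toy input the sender-only rule prefers rail 1, while the rule that also looks at the receiver prefers rail 0 because rail 1 is busy at the destination. That is the kind of difference the abstract attributes to reserving both endpoints.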

Research Organization:
Los Alamos National Laboratory
Sponsoring Organization:
DOE
OSTI ID:
975661
Report Number(s):
LA-UR-01-4202
Country of Publication:
United States
Language:
English

Similar Records

USING MULTIRAIL NETWORKS IN HIGH PERFORMANCE CLUSTERS
Conference · Wed Feb 28 23:00:00 EST 2001 · OSTI ID:776953

A framework for adaptive routing in multicomputer networks
Thesis/Dissertation · Sat Dec 31 23:00:00 EST 1988 · OSTI ID:6089781

Simulation studies of round robin contention in a prioritized CSMA broadcast network. [Dynamic assignment]
Conference · Fri Sep 01 00:00:00 EDT 1978 · OSTI ID:6461724