Summary: COLLECTIVE COMMUNICATION PATTERNS
ON THE QUADRICS NETWORK
Salvador Coll, José Duato, Francisco J. Mora
Technical University of Valencia, Valencia, Spain
scoll@eln.upv.es, jduato@gap.upv.es, fjmora@eln.upv.es
Fabrizio Petrini, Adolfy Hoisie
Los Alamos National Laboratory, Los Alamos, NM
fabrizio@lanl.gov, hoisie@lanl.gov
Abstract The efficient implementation of collective communication is key to
providing good performance and scalability for communication patterns that
involve global data movement and global control. Moreover, it is essential for
enhancing the fault tolerance of a parallel computer: for instance, for checking
the status of the nodes, running a distributed load-balancing algorithm,
synchronizing the local clocks, or monitoring performance. For these reasons,
support for multicast communication can improve the performance and resource
utilization of a parallel computer.
The Quadrics interconnect (QsNET), which is being used in some of the
largest machines in the world, provides hardware support for multicast. The
basic mechanism consists of the capability for a message to be sent to any set