ConnectX-2 CORE-Direct Enabled Asynchronous Broadcast Collective Communications

Gorentla Venkata, Manjunath; Graham, Richard L; Ladd, Joshua S; Shamis, Pavel; Rabinovitz, Ishai; Filipov, Vasily; Shainer, Gilad

Conference · Sat Jan 01 04:00:00 EST 2011

OSTI ID:1014252

Gorentla Venkata, Manjunath [1]; Graham, Richard L [1]; Ladd, Joshua S [1]; Shamis, Pavel [1]; Rabinovitz, Ishai [2]; Filipov, Vasily [2]; Shainer, Gilad [2]

[1] ORNL
[2] Mellanox Technologies, Inc.

This paper describes the design and implementation of InfiniBand (IB) CORE-Direct-based blocking and nonblocking broadcast operations within the Cheetah collective operation framework. It describes a novel approach that fully offloads collective operations and employs only user-supplied buffers. For a 64-rank communicator, the latency of the CORE-Direct-based hierarchical algorithm is better than that of production-grade Message Passing Interface (MPI) implementations: for a one-kilobyte (KB) message it is 150% better than the default Open MPI algorithm and 115% better than the shared-memory-optimized MVAPICH implementation, and for an eight-megabyte (MB) message it is 48% and 64% better, respectively. The flat-topology broadcast achieves 99.9% overlap in a polling-based communication-computation test and 95.1% overlap in a wait-based test, compared with 92.4% and 17.0%, respectively, for a similar Central Processing Unit (CPU) based implementation.

OSTI does not have a digital full text copy available.

Research Organization: Oak Ridge National Laboratory (ORNL)

Sponsoring Organization: SC USDOE - Office of Science (SC)

DOE Contract Number: AC05-00OR22725

OSTI ID: 1014252

Country of Publication: United States

Language: English

Similar Records

Exploring the All-to-All Collective Optimization Space with ConnectX CORE-Direct

Conference · Sat Sep 01 00:00:00 EDT 2012 · 2012 41st International Conference on Parallel Processing; 10-13 Sept. 2012; Pittsburgh, PA, USA · OSTI ID:1567578

Overlapping Computation and Communication: Barrier Algorithms and ConnectX-2 CORE-Direct Capabilities

Conference · Thu Dec 31 23:00:00 EST 2009 · OSTI ID:982147

Overlapping Computation and Communication: Barrier Algorithms and ConnectX-2 CORE-Direct Capabilities

Conference · Thu Dec 31 23:00:00 EST 2009 · OSTI ID:1003760

Related Subjects

99 GENERAL AND MISCELLANEOUS
ALGORITHMS
BUFFERS
COMMUNICATIONS
DESIGN
IMPLEMENTATION
PROCESSING
