U.S. Department of Energy
Office of Scientific and Technical Information

Efficient On-demand Connection Management Mechanisms with PGAS Models on InfiniBand

Conference

In the last decade or so, clusters have seen a tremendous rise in popularity due to their excellent price-to-performance ratio. A variety of interconnects have been proposed during this period, with InfiniBand leading the way due to its high performance and open standard. At the same time, multiple programming models have emerged to meet the requirements of various applications. To support the requirements of multiple programming models, InfiniBand provides multiple transport semantics, ranging from unreliable connectionless to reliable connected. Among them, the reliable connection (RC) semantics is widely used due to its high performance and its support for novel features such as Remote Direct Memory Access (RDMA), hardware atomics and network fault tolerance. However, the pairwise, connection-oriented nature of the RC transport semantics limits its scalability and usage at increasing processor counts. In this paper, we design and implement on-demand connection management approaches in the context of Partitioned Global Address Space (PGAS) programming models, which provide a shared memory abstraction and one-sided communication semantics and have led to the development of multiple languages (UPC, X10, Chapel) and libraries (Global Arrays, MPI-RMA). Using Global Arrays as the research vehicle, we implement this approach in the Aggregate Remote Memory Copy Interface (ARMCI), the runtime system of Global Arrays. We evaluate our approach, ARMCI-On Demand Connection Management (ARMCI-ODCM), using various microbenchmarks, benchmarks (LU Factorization, RandomAccess and a Lennard-Jones simulation) and an application (Subsurface Transport Over Multiple Phases (STOMP)). In a performance evaluation on up to 4096 processors, we obtain a multi-fold reduction in connection memory with a negligible degradation in performance. With STOMP at 4096 processors, ARMCI-ODCM reduces the overall connection memory by a factor of 66 with no performance degradation. To the best of our knowledge, this is the first design, implementation and evaluation of on-demand connection management with InfiniBand using PGAS models.
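
The general idea of on-demand connection management described in the abstract can be illustrated with a minimal Python sketch. This is not the ARMCI-ODCM implementation: the class `OnDemandConnectionTable`, its methods and the placeholder connection handle are hypothetical names chosen for illustration, and no InfiniBand calls are made. The sketch only shows that a process which opens a connection on first use holds state for the peers it actually communicates with, whereas a statically connected process holds one connection per remote process.

```python
class OnDemandConnectionTable:
    """Hypothetical per-process table that opens a connection on first use."""

    def __init__(self, rank, nprocs):
        self.rank = rank
        self.nprocs = nprocs
        self.connections = {}  # peer rank -> placeholder connection handle

    def _connect(self, peer):
        # Stand-in for a real connection handshake; this sketch only
        # returns a placeholder tuple and performs no network operation.
        return ("conn", self.rank, peer)

    def get(self, peer):
        """Return the connection to `peer`, creating it if it does not exist."""
        if peer == self.rank or not 0 <= peer < self.nprocs:
            raise ValueError(f"invalid peer rank: {peer}")
        conn = self.connections.get(peer)
        if conn is None:
            conn = self._connect(peer)
            self.connections[peer] = conn
        return conn


def static_connection_count(nprocs):
    """Connections held per process when every pair is connected at startup."""
    return nprocs - 1


if __name__ == "__main__":
    nprocs = 4096
    table = OnDemandConnectionTable(rank=0, nprocs=nprocs)

    # Assumed communication pattern: rank 0 talks only to its two ring
    # neighbours, many times over.
    for _ in range(1000):
        table.get(1)
        table.get(nprocs - 1)

    print(len(table.connections), static_connection_count(nprocs))
    # prints: 2 4095
```

The savings in such a scheme depend entirely on the communication pattern: an application in which every process eventually communicates with every other process would end up with the same number of connections as the static approach.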

Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (US)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
986276
Report Number(s):
PNNL-SA-70542; KJ0402000
Country of Publication:
United States
Language:
English

Similar Records

Dynamic Time-Variant Connection Management for PGAS Models on InfiniBand
Conference · Thu Sep 01 00:00:00 EDT 2011 · OSTI ID:1024543

Designing Scalable PGAS Communication Subsystems on Cray Gemini Interconnect
Conference · Tue Dec 25 23:00:00 EST 2012 · OSTI ID:1089101

On the Suitability of MPI as a PGAS Runtime
Conference · Wed Dec 17 23:00:00 EST 2014 · OSTI ID:1194324