Title: A uGNI-Based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect

Abstract

Gemini, the network for new Cray XE/XT systems, features low latency, high bandwidth, and strong scalability. Its hardware support for remote direct memory access enables efficient implementation of global address space programming languages. Although the Generic Network Interface (GNI) is designed to support message-passing applications, it is still challenging to attain good performance for applications written in alternative programming models, such as the message-driven programming model. In our earlier work we showed that CHARM++, an object-oriented message-driven programming model, scales up to the full Jaguar Cray machine. In this paper, we describe a general and lightweight asynchronous Low-level RunTime System (LRTS) for CHARM++ and its implementation on the uGNI software stack for Cray XE systems. Several techniques are presented to exploit the uGNI capability by reducing memory copy and registration overhead, taking advantage of persistent communication, and improving intra-node communication. Our micro-benchmark results demonstrate that the uGNI-based runtime system outperforms the MPI-based implementation by up to 50% in terms of message latency. For communication-intensive applications such as N-Queens, this implementation scales up to 15,360 cores of a Cray XE6 machine and is 70% faster than an MPI-based implementation. In the molecular dynamics application NAMD, performance is also considerably improved, by as much as 18%.

Authors:
Sun, Yanhua [1]; Zheng, Gengbin [1]; Olson, Ryan M [2]; Jones, Terry R [3]; Kale, Laxmikant V [1]
  1. University of Illinois, Urbana-Champaign
  2. Cray, Inc.
  3. ORNL
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Center for Computational Sciences
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1056922
DOE Contract Number:  
DE-AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: 26th IEEE International Parallel & Distributed Processing Symposium (IEEE IPDPS 2012), Shanghai, China, 20121110, 20121110
Country of Publication:
United States
Language:
English
Subject:
Cray XE6/XT6; Gemini Interconnect; Asynchronous message-passing; Lower Level Runtime System

Citation Formats

Sun, Yanhua, Zheng, Gengbin, Olson, Ryan M, Jones, Terry R, and Kale, Laxmikant V. A uGNI-Based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect. United States: N. p., 2012. Web.
Sun, Yanhua, Zheng, Gengbin, Olson, Ryan M, Jones, Terry R, & Kale, Laxmikant V. A uGNI-Based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect. United States.
Sun, Yanhua, Zheng, Gengbin, Olson, Ryan M, Jones, Terry R, and Kale, Laxmikant V. "A uGNI-Based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect". United States.
@article{osti_1056922,
title = {A uGNI-Based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect},
author = {Sun, Yanhua and Zheng, Gengbin and Olson, Ryan M and Jones, Terry R and Kale, Laxmikant V},
abstractNote = {Gemini, the network for new Cray XE/XT systems, features low latency, high bandwidth, and strong scalability. Its hardware support for remote direct memory access enables efficient implementation of global address space programming languages. Although the Generic Network Interface (GNI) is designed to support message-passing applications, it is still challenging to attain good performance for applications written in alternative programming models, such as the message-driven programming model. In our earlier work we showed that CHARM++, an object-oriented message-driven programming model, scales up to the full Jaguar Cray machine. In this paper, we describe a general and lightweight asynchronous Low-level RunTime System (LRTS) for CHARM++ and its implementation on the uGNI software stack for Cray XE systems. Several techniques are presented to exploit the uGNI capability by reducing memory copy and registration overhead, taking advantage of persistent communication, and improving intra-node communication. Our micro-benchmark results demonstrate that the uGNI-based runtime system outperforms the MPI-based implementation by up to 50% in terms of message latency. For communication-intensive applications such as N-Queens, this implementation scales up to 15,360 cores of a Cray XE6 machine and is 70% faster than an MPI-based implementation. In the molecular dynamics application NAMD, performance is also considerably improved, by as much as 18%.},
place = {United States},
year = {2012},
month = {1}
}

