SeaStar Unchained: Multiplying the Performance of the Cray SeaStar Network
- ORNL
The Oak Ridge Leadership Computing Facility (OLCF) supports many different systems and many different interconnects. The only common programming interfaces across these systems are BSD Sockets and MPI. Because of design assumptions such as implicit buffering, which leads to extra copies, Sockets performance is almost universally lower than that of the native interface. Even where Sockets provides bandwidth similar to the native interface, it suffers from excessive CPU usage. MPI is the de facto interface for intra-job communication, but it is difficult to use between jobs and provides no ability to communicate with service nodes or off-system nodes (e.g., for I/O forwarding). We have developed the Common Communication Interface (CCI), a programming interface that exposes the advances in interconnect hardware, notably Remote Direct Memory Access (RDMA) and operating system (OS) bypass, while imposing as little overhead as possible. This API directly supports inter-job as well as off-system communication. CCI is a lightweight abstraction layer that provides point-to-point messaging and remote memory access. The Cray SeaStar ASIC, with its programmable embedded processor, provides an excellent platform for investigating the properties of various network protocols and programming interfaces. This paper describes our native implementation of CCI on the SeaStar platform and details how we implemented full OS bypass for common operations. Compared to the generic Cray Portals implementation, we demonstrate a 30% to 50% reduction in latency, a more than six-fold increase in message injection rate, and an almost 7x improvement in bandwidth for small message sizes.
- Research Organization:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). National Center for Computational Sciences (NCCS)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- DOE Contract Number:
- DE-AC05-00OR22725
- OSTI ID:
- 1084417
- Resource Relation:
- Conference: Cray User Group, Napa, CA, USA, May 6-9, 2013
- Country of Publication:
- United States
- Language:
- English
Similar Records
A uGNI-Based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect
Evaluating the Potential of Cray Gemini Interconnect for PGAS Communication Runtime Systems