Collective Framework and Performance Optimizations to Open MPI for Cray XT Platforms
- ORNL
The performance and scalability of collective operations play a key role in the performance and scalability of many scientific applications. Within the Open MPI code base we have developed a general-purpose hierarchical collective operations framework called Cheetah and applied it at large scale on the Oak Ridge Leadership Computing Facility's (OLCF) Jaguar platform, obtaining better performance and scalability than the native MPI implementation. This paper discusses Cheetah's design and implementation, and optimizations to the framework for Cray XT5 platforms. Our results show that Cheetah's broadcast and barrier perform better than the native MPI implementation. For medium data, Cheetah's broadcast outperforms the native MPI implementation by 93% at 49,152 processes. For small and large data, it outperforms the native MPI implementation by 10% and 9%, respectively, at 24,576 processes. Cheetah's barrier performs 10% better than the native MPI implementation at 12,288 processes.
- Research Organization:
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- DOE Contract Number:
- DE-AC05-00OR22725
- OSTI ID:
- 1035529
- Resource Relation:
- Conference: 53rd Cray User Group meeting (CUG 2011), Fairbanks, AK, USA, May 23-26, 2011
- Country of Publication:
- United States
- Language:
- English