OSTI.GOV: U.S. Department of Energy, Office of Scientific and Technical Information

Title: Optimizing fine-grained communication in a biomolecular simulation application on Cray XK6

Abstract

Achieving good scaling for fine-grained, communication-intensive applications on modern supercomputers remains challenging. In our previous work, we have shown that such an application -- NAMD -- scales well on the full Jaguar XT5 without long-range interactions; yet, with them, the speedup falters beyond 64K cores. Although the new Gemini interconnect on Cray XK6 has improved network performance, the challenges remain, and are likely to remain for other such networks as well. We analyze communication bottlenecks in NAMD and its CHARM++ runtime, using the Projections performance analysis tool. Based on the analysis, we optimize the runtime, built on the uGNI library for Gemini. We present several techniques to improve the fine-grained communication. Consequently, the performance of running 92,224-atom Apoa1 with GPUs on TitanDev is improved by 36%. For 100-million-atom STMV, we improve upon the prior Jaguar XT5 result of 26 ms/step to 13 ms/step using 298,992 cores on Jaguar XK6.

Authors:
Sun, Yanhua; Zheng, Gengbin; Mei, Chao; Bohm, Eric J.; Phillips, James C.; Kale, Laxmikant V.; Jones, Terry R.
Publication Date:
2012-11
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1567605
Resource Type:
Conference
Journal Name:
SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis; 10-16 Nov. 2012; Salt Lake City, UT, USA
Additional Journal Information:
Conference: SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Country of Publication:
United States
Language:
English

Citation Formats

Sun, Yanhua, Zheng, Gengbin, Mei, Chao, Bohm, Eric J., Phillips, James C., Kale, Laxmikant V., and Jones, Terry R. Optimizing fine-grained communication in a biomolecular simulation application on Cray XK6. United States: N. p., 2012. Web. doi:10.1109/SC.2012.87.
Sun, Yanhua, Zheng, Gengbin, Mei, Chao, Bohm, Eric J., Phillips, James C., Kale, Laxmikant V., & Jones, Terry R. Optimizing fine-grained communication in a biomolecular simulation application on Cray XK6. United States. doi:10.1109/SC.2012.87.
Sun, Yanhua, Zheng, Gengbin, Mei, Chao, Bohm, Eric J., Phillips, James C., Kale, Laxmikant V., and Jones, Terry R. 2012. "Optimizing fine-grained communication in a biomolecular simulation application on Cray XK6". United States. doi:10.1109/SC.2012.87.
@article{osti_1567605,
title = {Optimizing fine-grained communication in a biomolecular simulation application on Cray XK6},
author = {Sun, Yanhua and Zheng, Gengbin and Mei, Chao and Bohm, Eric J. and Phillips, James C. and Kale, Laxmikant V. and Jones, Terry R.},
abstractNote = {Achieving good scaling for fine-grained, communication-intensive applications on modern supercomputers remains challenging. In our previous work, we have shown that such an application -- NAMD -- scales well on the full Jaguar XT5 without long-range interactions; yet, with them, the speedup falters beyond 64K cores. Although the new Gemini interconnect on Cray XK6 has improved network performance, the challenges remain, and are likely to remain for other such networks as well. We analyze communication bottlenecks in NAMD and its CHARM++ runtime, using the Projections performance analysis tool. Based on the analysis, we optimize the runtime, built on the uGNI library for Gemini. We present several techniques to improve the fine-grained communication. Consequently, the performance of running 92,224-atom Apoa1 with GPUs on TitanDev is improved by 36%. For 100-million-atom STMV, we improve upon the prior Jaguar XT5 result of 26 ms/step to 13 ms/step using 298,992 cores on Jaguar XK6.},
doi = {10.1109/SC.2012.87},
journal = {SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis; 10-16 Nov. 2012; Salt Lake City, UT, USA},
number = {},
volume = {},
place = {United States},
year = {2012},
month = {11}
}
