Title: Pre-exascale accelerated application development: The ORNL Summit experience

Abstract

High-performance computing (HPC) increasingly relies on heterogeneous architectures to achieve higher performance. In the Oak Ridge Leadership Facility (OLCF), Oak Ridge, TN, USA, this trend continues as its latest supercomputer, Summit, entered production in early 2019. The combination of IBM POWER9 CPU and NVIDIA V100 GPU, along with a fast NVLink2 interconnect and other latest technologies, pushes system performance to a new height and breaks the exascale barrier by certain measures. Due to Summit's powerful GPUs and much higher GPU–CPU ratio, offloading to accelerators becomes a requirement for any application that intends to use the system effectively. To facilitate navigating a complex landscape of competing heterogeneous architectures, a collection of applications from a wide spectrum of scientific domains is selected for early adoption on Summit. In this article, the experience and lessons learned are summarized, in the hope of providing useful guidance to address new programming challenges, such as scalability, performance portability, and software maintainability, for future application development efforts on heterogeneous HPC systems.

Authors:
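Luo, L. [1]; Straatsma, T. P. [2]; Aguilar Suarez, L. [3]; Broer, R. [3]; Bykov, D. [2]; D'Azevedo, E. F. [2]; Faraji, S. S. [3]; Gottiparthi, K. C. [2]; De Graaf, C. [4]; Harris, J. A. [2]; Havenith, R. W. A. [3]; Jensen, H. Aa. [5]; Joubert, W. [2]; Kathir, R. K. [3]; Larkin, J. [6]; Li, Y. W. [7]; Lyakh, D. I. [2]; Messer, O. E. B. [2]; Norman, M. R. [2]; Oefelein, J. C. [8]; Sankaran, R. [2]; Tillack, A. F. [2]; Barnes, A. L. [2]; Visscher, L. [9]; Wells, J. C. [2]; Wibowo, M. [3]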
  1. IBM Research, Oak Ridge, TN (United States)
  2. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
  3. Univ. of Groningen (The Netherlands)
  4. Univ. Rovira i Virgili, Tarragona (Spain)
  5. Univ. of Southern Denmark, Odense (Denmark)
  6. Nvidia Corporation, Oak Ridge, TN (United States)
  7. Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
  8. Georgia Inst. of Technology, Atlanta, GA (United States)
  9. Univ. of Amsterdam (Netherlands)
Publication Date:
2020-05-01
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
OSTI Identifier:
1649509
Grant/Contract Number:  
AC05-00OR22725
Resource Type:
Journal Article: Accepted Manuscript
Journal Name:
IBM Journal of Research and Development
Additional Journal Information:
Journal Volume: 64; Journal Issue: 3/4; Journal ID: ISSN 0018-8646
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; High performance computing; accelerated computing; GPU

Citation Formats

Luo, L., Straatsma, T. P., Aguilar Suarez, L., Broer, R., Bykov, D., D'Azevedo, E. F., Faraji, S. S., Gottiparthi, K. C., De Graaf, C., Harris, J. A., Havenith, R. W. A., Jensen, H. Aa., Joubert, W., Kathir, R. K., Larkin, J., Li, Y. W., Lyakh, D. I., Messer, O. E. B., Norman, M. R., Oefelein, J. C., Sankaran, R., Tillack, A. F., Barnes, A. L., Visscher, L., Wells, J. C., and Wibowo, M. Pre-exascale accelerated application development: The ORNL Summit experience. United States: N. p., 2020. Web. doi:10.1147/jrd.2020.2965881.
Luo, L., Straatsma, T. P., Aguilar Suarez, L., Broer, R., Bykov, D., D'Azevedo, E. F., Faraji, S. S., Gottiparthi, K. C., De Graaf, C., Harris, J. A., Havenith, R. W. A., Jensen, H. Aa., Joubert, W., Kathir, R. K., Larkin, J., Li, Y. W., Lyakh, D. I., Messer, O. E. B., Norman, M. R., Oefelein, J. C., Sankaran, R., Tillack, A. F., Barnes, A. L., Visscher, L., Wells, J. C., & Wibowo, M. Pre-exascale accelerated application development: The ORNL Summit experience. United States. https://doi.org/10.1147/jrd.2020.2965881
Luo, L., Straatsma, T. P., Aguilar Suarez, L., Broer, R., Bykov, D., D'Azevedo, E. F., Faraji, S. S., Gottiparthi, K. C., De Graaf, C., Harris, J. A., Havenith, R. W. A., Jensen, H. Aa., Joubert, W., Kathir, R. K., Larkin, J., Li, Y. W., Lyakh, D. I., Messer, O. E. B., Norman, M. R., Oefelein, J. C., Sankaran, R., Tillack, A. F., Barnes, A. L., Visscher, L., Wells, J. C., and Wibowo, M. 2020. "Pre-exascale accelerated application development: The ORNL Summit experience". United States. https://doi.org/10.1147/jrd.2020.2965881. https://www.osti.gov/servlets/purl/1649509.
@article{osti_1649509,
title = {Pre-exascale accelerated application development: The ORNL Summit experience},
author = {Luo, L. and Straatsma, T. P. and Aguilar Suarez, L. and Broer, R. and Bykov, D. and D'Azevedo, E. F. and Faraji, S. S. and Gottiparthi, K. C. and De Graaf, C. and Harris, J. A. and Havenith, R. W. A. and Jensen, H. Aa. and Joubert, W. and Kathir, R. K. and Larkin, J. and Li, Y. W. and Lyakh, D. I. and Messer, O. E. B. and Norman, M. R. and Oefelein, J. C. and Sankaran, R. and Tillack, A. F. and Barnes, A. L. and Visscher, L. and Wells, J. C. and Wibowo, M.},
abstractNote = {High-performance computing (HPC) increasingly relies on heterogeneous architectures to achieve higher performance. In the Oak Ridge Leadership Facility (OLCF), Oak Ridge, TN, USA, this trend continues as its latest supercomputer, Summit, entered production in early 2019. The combination of IBM POWER9 CPU and NVIDIA V100 GPU, along with a fast NVLink2 interconnect and other latest technologies, pushes system performance to a new height and breaks the exascale barrier by certain measures. Due to Summit's powerful GPUs and much higher GPU–CPU ratio, offloading to accelerators becomes a requirement for any application that intends to use the system effectively. To facilitate navigating a complex landscape of competing heterogeneous architectures, a collection of applications from a wide spectrum of scientific domains is selected for early adoption on Summit. In this article, the experience and lessons learned are summarized, in the hope of providing useful guidance to address new programming challenges, such as scalability, performance portability, and software maintainability, for future application development efforts on heterogeneous HPC systems.},
doi = {10.1147/jrd.2020.2965881},
url = {https://www.osti.gov/biblio/1649509},
journal = {IBM Journal of Research and Development},
issn = {0018-8646},
number = {3/4},
volume = 64,
place = {United States},
year = {2020},
month = may
}