DOE PAGES, U.S. Department of Energy
Office of Scientific and Technical Information

Title: HOMMEXX 1.0: a performance-portable atmospheric dynamical core for the Energy Exascale Earth System Model

Abstract

We present an architecture-portable and performant implementation of the atmospheric dynamical core (High-Order Methods Modeling Environment, HOMME) of the Energy Exascale Earth System Model (E3SM). The original Fortran implementation is highly performant and scalable on conventional architectures using the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) programming models. We rewrite the model in C++ and use the Kokkos library to express on-node parallelism in a largely architecture-independent implementation. Kokkos provides an abstraction of a compute node or device, layout-polymorphic multidimensional arrays, and parallel execution constructs. The new implementation achieves the same or better performance on conventional multicore computers and is portable to GPUs. We present performance data for the original and new implementations on multiple platforms, on up to 5400 compute nodes, and study several aspects of the single- and multi-node performance characteristics of the new implementation on conventional CPUs (e.g., Intel Xeon), many-core CPUs (e.g., Intel Xeon Phi Knights Landing), and Nvidia V100 GPUs.

Authors:
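Bertagna, Luca [1]; Deakin, Michael [1]; Guba, Oksana [1]; Sunderland, Daniel [1]; Bradley, Andrew M. [1]; Tezaur, Irina K. [1]; Taylor, Mark A. [1]; Salinger, Andrew G. [1]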
  1. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Publication Date:
Research Org.:
Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sponsoring Org.:
USDOE Office of Science (SC)
OSTI Identifier:
1529244
Grant/Contract Number:  
NA0003525; AC02-05CH11231
Resource Type:
Accepted Manuscript
Journal Name:
Geoscientific Model Development (Online)
Additional Journal Information:
Journal Name: Geoscientific Model Development (Online); Journal Volume: 12; Journal Issue: 4; Journal ID: ISSN 1991-9603
Publisher:
European Geosciences Union
Country of Publication:
United States
Language:
English

Citation Formats

Bertagna, Luca, Deakin, Michael, Guba, Oksana, Sunderland, Daniel, Bradley, Andrew M., Tezaur, Irina K., Taylor, Mark A., and Salinger, Andrew G. HOMMEXX 1.0: a performance-portable atmospheric dynamical core for the Energy Exascale Earth System Model. United States: N. p., 2019. Web. doi:10.5194/gmd-12-1423-2019.
Bertagna, Luca, Deakin, Michael, Guba, Oksana, Sunderland, Daniel, Bradley, Andrew M., Tezaur, Irina K., Taylor, Mark A., & Salinger, Andrew G. HOMMEXX 1.0: a performance-portable atmospheric dynamical core for the Energy Exascale Earth System Model. United States. doi:10.5194/gmd-12-1423-2019.
Bertagna, Luca, Deakin, Michael, Guba, Oksana, Sunderland, Daniel, Bradley, Andrew M., Tezaur, Irina K., Taylor, Mark A., and Salinger, Andrew G. 2019. "HOMMEXX 1.0: a performance-portable atmospheric dynamical core for the Energy Exascale Earth System Model". United States. doi:10.5194/gmd-12-1423-2019. https://www.osti.gov/servlets/purl/1529244.
@article{osti_1529244,
title = {HOMMEXX 1.0: a performance-portable atmospheric dynamical core for the Energy Exascale Earth System Model},
author = {Bertagna, Luca and Deakin, Michael and Guba, Oksana and Sunderland, Daniel and Bradley, Andrew M. and Tezaur, Irina K. and Taylor, Mark A. and Salinger, Andrew G.},
abstractNote = {We present an architecture-portable and performant implementation of the atmospheric dynamical core (High-Order Methods Modeling Environment, HOMME) of the Energy Exascale Earth System Model (E3SM). The original Fortran implementation is highly performant and scalable on conventional architectures using the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) programming models. We rewrite the model in C++ and use the Kokkos library to express on-node parallelism in a largely architecture-independent implementation. Kokkos provides an abstraction of a compute node or device, layout-polymorphic multidimensional arrays, and parallel execution constructs. The new implementation achieves the same or better performance on conventional multicore computers and is portable to GPUs. We present performance data for the original and new implementations on multiple platforms, on up to 5400 compute nodes, and study several aspects of the single- and multi-node performance characteristics of the new implementation on conventional CPUs (e.g., Intel Xeon), many-core CPUs (e.g., Intel Xeon Phi Knights Landing), and Nvidia V100 GPUs.},
doi = {10.5194/gmd-12-1423-2019},
journal = {Geoscientific Model Development (Online)},
number = 4,
volume = 12,
place = {United States},
year = {2019},
month = {4}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record
