Title: Porting the WAVEWATCH III (v6.07) wave action source terms to GPU

Journal Article · Geoscientific Model Development (Online)

Surface gravity waves play a critical role in several processes, including mixing, coastal inundation, and surface fluxes. Despite the growing literature on the importance of ocean surface waves, wind–wave processes have traditionally been excluded from Earth system models (ESMs) because of the high computational cost of running spectral wave models. The development of the Next Generation Ocean Model for the Department of Energy's (DOE) Energy Exascale Earth System Model (E3SM) project partly focuses on including a wave model, WAVEWATCH III (WW3), in E3SM. WW3, which was originally developed for operational wave forecasting, needs to become computationally less expensive before it can be integrated into ESMs. To accomplish this, we take advantage of heterogeneous architectures at DOE leadership computing facilities and the increasing computing power of general-purpose graphics processing units (GPUs). This paper identifies the wave action source terms, W3SRCEMD, as the most computationally intensive module in WW3 and then accelerates them on GPUs. Our experiments on two computing platforms, Kodiak (P100 GPU and Intel(R) Xeon(R) E5-2695 v4 central processing unit, CPU) and Summit (V100 GPU and IBM POWER9 CPU), show respective average speedups of 2× and 4× when mapping one Message Passing Interface (MPI) rank per GPU. An average speedup of 1.4× was achieved using all 42 CPU cores and 6 GPUs on a Summit node (with 7 MPI ranks per GPU). However, the GPU speedup over the 42 CPU cores remains relatively unchanged (~1.3×) even when using 4 MPI ranks per GPU (24 ranks in total) and 3 MPI ranks per GPU (18 ranks in total). This corresponds to a 35 %–40 % decrease in both simulation time and usage of resources. Because the W3SRCEMD subroutine has many local scalars and arrays and WW3 has a large memory requirement, GPU performance is currently limited by the data transfer bandwidth between the CPU and the GPU. Ideally, OpenACC routine directives could be used to further improve performance. However, W3SRCEMD would require significant code refactoring to make this possible. We also discuss how the trade-off among occupancy, register usage, and latency affects the GPU performance of WW3.
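
The rank-to-GPU configurations quoted in the abstract (42, 24, and 18 MPI ranks sharing the 6 GPUs of a Summit node) can be illustrated with a minimal sketch. The round-robin assignment and the names assign_gpus and ranks_per_gpu are assumptions made for illustration; the abstract does not say how the authors bind ranks to devices.

```python
from collections import Counter


def assign_gpus(n_ranks: int, n_gpus: int) -> list[int]:
    """Return a GPU index for each node-local MPI rank.

    Round-robin binding is an assumption for illustration only;
    it is not taken from the paper.
    """
    if n_ranks <= 0 or n_gpus <= 0:
        raise ValueError("n_ranks and n_gpus must be positive")
    return [rank % n_gpus for rank in range(n_ranks)]


def ranks_per_gpu(n_ranks: int, n_gpus: int) -> dict[int, int]:
    """Count how many MPI ranks share each GPU under the assumed binding."""
    counts = Counter(assign_gpus(n_ranks, n_gpus))
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Rank counts quoted in the abstract for one Summit node with 6 GPUs.
    for n_ranks in (42, 24, 18):
        print(n_ranks, "ranks:", ranks_per_gpu(n_ranks, 6))
```

Under this assumed binding, every GPU is shared by the same number of ranks, which matches the 7, 4, and 3 ranks per GPU stated in the abstract.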

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States); Los Alamos National Laboratory (LANL), Los Alamos, NM (United States); Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Biological and Environmental Research (BER); USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
AC05-00OR22725; 89233218CNA000001; ESMD-SFA; AC02-06CH11357
OSTI ID:
1959840
Alternate ID(s):
OSTI ID: 1965244; OSTI ID: 1969236; OSTI ID: 2382742
Report Number(s):
LA-UR-22-24512
Journal Information:
Geoscientific Model Development (Online), Vol. 16, Issue 4; ISSN 1991-9603
Publisher:
Copernicus Publications, EGU
Country of Publication:
United States
Language:
English
