A case study of CUDA FORTRAN and OpenACC for an atmospheric climate kernel
- Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
- Nvidia, Santa Clara, CA (United States)
- Cray, Seattle, WA (United States)
This work considers porting a key kernel in the tracer advection routines of the Community Atmosphere Model – Spectral Element (CAM-SE) to Graphics Processing Units (GPUs) using OpenACC, in comparison to an existing CUDA FORTRAN port. Developing the OpenACC kernel for GPUs was substantially simpler than developing the CUDA port, although the OpenACC version ran about 1.5× slower than the optimized CUDA version. Particular focus is given to compiler maturity regarding OpenACC implementation for modern FORTRAN, and the Cray implementation is found to be currently more mature than the PGI implementation. Still, for the case that ran successfully on PGI, the PGI OpenACC runtime was slightly faster than the Cray runtime. The results show encouraging performance for the OpenACC implementation compared to CUDA while also exposing some issues that may need to be resolved before the implementations are suitable for porting all of CAM-SE. Most notably, future OpenACC implementations should use GPU shared memory, and derived type support should be expanded.
- Research Organization:
- Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
- Sponsoring Organization:
- USDOE Office of Science (SC)
- Grant/Contract Number:
- AC05-00OR22725
- OSTI ID:
- 1462913
- Alternate ID(s):
- OSTI ID: 1251654
- Journal Information:
- Journal of Computational Science, Vol. 9, Issue C; ISSN 1877-7503
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English