Title: Nekbone performance on GPUs with OpenACC and CUDA Fortran implementations

Journal Article · Journal of Supercomputing

We present a hybrid GPU implementation and performance analysis of Nekbone, which represents one of the core kernels of the incompressible Navier–Stokes solver Nek5000. The implementation is based on OpenACC and CUDA Fortran for local parallelization of the compute-intensive matrix–matrix multiplication part, which minimizes modification of the existing CPU code while extending the simulation capability of the code to GPU architectures. Our discussion includes the GPU results of OpenACC interoperating with CUDA Fortran and the gather–scatter operations with GPUDirect communication. We demonstrate performance of up to 552 Tflops on 16,384 GPUs of the OLCF Cray XK7 Titan.
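
To put the headline figure in per-device terms, the short Python sketch below divides the aggregate rate quoted in the abstract by the GPU count. The function name `per_gpu_gflops` is hypothetical, and the result is only an average upper bound implied by the "up to 552 Tflops" figure, not a number reported in this record.

```python
# Figures quoted in the abstract; everything else here is illustrative.
TOTAL_TFLOPS = 552.0   # "up to 552 Tflops" aggregate performance
NUM_GPUS = 16_384      # GPUs used on the OLCF Cray XK7 Titan


def per_gpu_gflops(total_tflops: float, num_gpus: int) -> float:
    """Average rate per GPU in Gflops (1 Tflops = 1000 Gflops).

    Hypothetical helper: assumes the aggregate rate is spread evenly
    across all GPUs, which the abstract does not state.
    """
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return total_tflops * 1000.0 / num_gpus


if __name__ == "__main__":
    rate = per_gpu_gflops(TOTAL_TFLOPS, NUM_GPUS)
    print(f"{rate:.1f} Gflops per GPU on average")
```

Under that even-split assumption, the quoted aggregate corresponds to roughly 34 Gflops per GPU.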

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF)
Sponsoring Organization:
USDOE Office of Science (SC)
DOE Contract Number:
AC02-06CH11357; AC05-00OR22725
OSTI ID:
1565549
Journal Information:
Journal of Supercomputing, Vol. 72, Issue 11; ISSN 0920-8542
Publisher:
Springer
Country of Publication:
United States
Language:
English
