Title: A Framework for Lattice QCD Calculations on GPUs

Conference

Computing platforms equipped with accelerators such as GPUs provide great computational power. However, exploiting such platforms for existing scientific applications is not a trivial task. Current GPU programming frameworks such as CUDA C/C++ require low-level programming from the developer to achieve high-performance code. As a result, porting of applications to GPUs is typically limited to the time-dominant algorithms and routines, leaving the remainder unaccelerated, which can create a serious Amdahl's law bottleneck. The lattice QCD application Chroma allows a different porting strategy to be explored. The layered structure of its software architecture logically separates the data-parallel layer from the application layer. The QCD Data-Parallel software layer provides data types and expressions with stencil-like operations suitable for lattice field theory, and Chroma implements its algorithms in terms of this high-level interface. Thus, by porting the low-level layer, one can effectively move the whole application to a different platform in one step. The QDP-JIT/PTX library, the reimplementation of the low-level layer, provides a framework for lattice QCD calculations on the CUDA architecture. The complete software interface is supported, so applications can run unaltered on GPU-based parallel computers. This reimplementation was made possible by the availability of a JIT compiler (part of the NVIDIA Linux kernel driver) that translates an assembly-like language (PTX) into GPU code. The expression template technique is used to build PTX code generators, and a software cache manages the GPU memory. This reimplementation allows us to deploy an efficient implementation of the full gauge-generation program with dynamical fermions on large-scale GPU-based machines such as Titan and Blue Waters, accelerating the algorithm by more than an order of magnitude.
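
The abstract states that expression templates are used to build PTX code generators. The following is a minimal sketch of that general idea and not the QDP-JIT/PTX implementation: every name in it (`CodeGen`, `Field`, `BinaryExpr`, `generate`) is hypothetical, and the emitted text is only PTX-like, omitting register declarations, address computation, and thread indexing. An arithmetic expression is captured as a nested type at compile time, and walking that type emits one instruction per node instead of computing a value.

```cpp
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical code-generation context: hands out virtual register
// names and accumulates the emitted instruction text.
struct CodeGen {
    int next_reg = 0;
    std::ostringstream out;
    std::string fresh() { return "%f" + std::to_string(next_reg++); }
};

// Tag base class so the overloaded operators below only match
// expression types from this sketch.
struct ExprTag {};

// Leaf node: a named field; generating code for it emits a load.
struct Field : ExprTag {
    std::string name;
    explicit Field(std::string n) : name(std::move(n)) {}
    std::string gen(CodeGen& cg) const {
        std::string r = cg.fresh();
        cg.out << "ld.global.f32 " << r << ", [" << name << "];\n";
        return r;
    }
};

// Operation tags carrying the instruction mnemonic.
struct AddOp { static const char* mnemonic() { return "add.f32"; } };
struct MulOp { static const char* mnemonic() { return "mul.f32"; } };

// Interior node: the expression tree is encoded in the template type.
template <class Op, class L, class R>
struct BinaryExpr : ExprTag {
    L lhs;
    R rhs;
    BinaryExpr(const L& l, const R& r) : lhs(l), rhs(r) {}
    std::string gen(CodeGen& cg) const {
        std::string a = lhs.gen(cg);  // emit code for the left operand first
        std::string b = rhs.gen(cg);  // then for the right operand
        std::string r = cg.fresh();
        cg.out << Op::mnemonic() << " " << r << ", " << a << ", " << b << ";\n";
        return r;
    }
};

template <class L, class R>
using EnableIfExpr = typename std::enable_if<
    std::is_base_of<ExprTag, L>::value &&
    std::is_base_of<ExprTag, R>::value>::type;

// Operators build tree nodes; no arithmetic happens here.
template <class L, class R, class = EnableIfExpr<L, R>>
BinaryExpr<AddOp, L, R> operator+(const L& l, const R& r) {
    return BinaryExpr<AddOp, L, R>(l, r);
}

template <class L, class R, class = EnableIfExpr<L, R>>
BinaryExpr<MulOp, L, R> operator*(const L& l, const R& r) {
    return BinaryExpr<MulOp, L, R>(l, r);
}

// Walk the expression tree and emit code that stores the result in dest.
template <class E>
std::string generate(const std::string& dest, const E& expr) {
    CodeGen cg;
    std::string r = expr.gen(cg);
    cg.out << "st.global.f32 [" << dest << "], " << r << ";\n";
    return cg.out.str();
}
```

In this sketch, `generate("d", Field("a") + Field("b") * Field("c"))` returns a string with three loads, one `mul.f32`, one `add.f32`, and one store. In a setup like the one the abstract describes, text of this kind would then be handed to a JIT compiler rather than returned to the caller.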

Research Organization:
Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
DOE Contract Number:
AC05-06OR23177
OSTI ID:
1158933
Report Number(s):
JLAB-IT-14-01; DOE/OR/23177-3190; arXiv:1408.5925; OLCF Titan through Directors Discretionary Allocation LGT006 (2012-2013); INCITE project Allocation LGT003 2012-2013; NSF Award OCI 07-25070; Research Executive Agency (REA) of the European Union Grant PITN-GA-2009-238353
Resource Relation:
Conference: 28th International Parallel and Distributed Processing Symposium, 19-23 May 2014, Phoenix, AZ
Country of Publication:
United States
Language:
English

Similar Records

QDP-JIT/PTX: A QDP++ Implementation for CUDA-Enabled GPUs
Conference · 1 November 2014 · Proceedings of Science · OSTI ID:1158933

Automatic Offloading C++ Expression Templates to CUDA Enabled GPUs
Conference · 1 May 2012 · OSTI ID:1158933

Computing the Properties of Hadrons, Nuclei, and Nuclear Matter from Quantum Chromodynamics, UNC-CH Final Report.
Technical Report · 15 February 2018 · OSTI ID:1158933
