A Framework for Lattice QCD Calculations on GPUs

Winter, Frank; Clark, M A; Edwards, Robert G; Joo, Balint

doi:10.1109/IPDPS.2014.112

Conference · August 1, 2014

DOI: https://doi.org/10.1109/IPDPS.2014.112 · OSTI ID: 1158933

Computing platforms equipped with accelerators such as GPUs have proven to provide great computational power. However, exploiting such platforms for existing scientific applications is not a trivial task. Current GPU programming frameworks such as CUDA C/C++ require low-level programming from the developer in order to achieve high-performance code. As a result, porting of applications to GPUs is typically limited to time-dominant algorithms and routines, leaving the remainder unaccelerated, which can create a serious Amdahl's law issue. The lattice QCD application Chroma allows us to explore a different porting strategy. The layered structure of the software architecture logically separates the data-parallel layer from the application layer. The QCD Data-Parallel software layer provides data types and expressions with stencil-like operations suitable for lattice field theory, and Chroma implements algorithms in terms of this high-level interface. Thus, by porting the low-level layer, one can effectively move the whole application to a different platform in one step. The QDP-JIT/PTX library, a reimplementation of the low-level layer, provides a framework for lattice QCD calculations on the CUDA architecture. The complete software interface is supported, and thus applications can be run unaltered on GPU-based parallel computers. This reimplementation was possible due to the availability of a JIT compiler (part of the NVIDIA Linux kernel driver) that translates an assembly-like language (PTX) to GPU code. The expression template technique is used to build PTX code generators, and a software cache manages the GPU memory. This reimplementation allows us to deploy an efficient implementation of the full gauge-generation program with dynamical fermions on large-scale GPU-based machines such as Titan and Blue Waters, which accelerates the algorithm by more than an order of magnitude.
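The idea of using expression templates to build code generators can be illustrated with a minimal C++ sketch. This is not the QDP-JIT/PTX implementation: the names `Emitter`, `Field`, `BinaryExpr`, and `generate` are hypothetical, and the emitted text only imitates the look of PTX rather than being a complete, valid kernel. The sketch assumes single-precision scalar fields and shows how overloaded operators can capture an expression such as `a + b * c` as a type, which is then walked once to emit assembly-like text instead of being evaluated on the host.

```cpp
#include <sstream>
#include <string>
#include <utility>

// Hypothetical code-generation context: hands out virtual register names
// and accumulates PTX-like text (illustrative only, not a full PTX kernel).
struct Emitter {
    std::ostringstream code;
    int next_reg = 0;
    std::string fresh() { return "%f" + std::to_string(next_reg++); }
};

// CRTP base so the overloaded operators only match expression types.
template <class Derived>
struct Expr {
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// Leaf node: a named field whose value is loaded from memory.
struct Field : Expr<Field> {
    std::string name;
    explicit Field(std::string n) : name(std::move(n)) {}
    std::string emit(Emitter& e) const {
        std::string r = e.fresh();
        e.code << "ld.global.f32 " << r << ", [" << name << "];\n";
        return r;
    }
};

// Operation tags select the instruction mnemonic at compile time.
struct AddOp { static const char* mnemonic() { return "add.f32"; } };
struct MulOp { static const char* mnemonic() { return "mul.f32"; } };

// Interior node: the expression tree is encoded in the template arguments.
template <class Op, class L, class R>
struct BinaryExpr : Expr<BinaryExpr<Op, L, R>> {
    L lhs;
    R rhs;
    BinaryExpr(const L& l, const R& r) : lhs(l), rhs(r) {}
    std::string emit(Emitter& e) const {
        std::string a = lhs.emit(e);  // emit left operand first
        std::string b = rhs.emit(e);  // then right operand
        std::string r = e.fresh();
        e.code << Op::mnemonic() << " " << r << ", " << a << ", " << b << ";\n";
        return r;
    }
};

template <class L, class R>
BinaryExpr<AddOp, L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return BinaryExpr<AddOp, L, R>(l.self(), r.self());
}

template <class L, class R>
BinaryExpr<MulOp, L, R> operator*(const Expr<L>& l, const Expr<R>& r) {
    return BinaryExpr<MulOp, L, R>(l.self(), r.self());
}

// Walk the expression once and return the generated text, ending with a
// store of the result register to the destination `dest`.
template <class E>
std::string generate(const Expr<E>& expr, const std::string& dest) {
    Emitter e;
    std::string r = expr.self().emit(e);
    e.code << "st.global.f32 [" << dest << "], " << r << ";\n";
    return e.code.str();
}
```

With fields `Field a("a"), b("b"), c("c");`, calling `generate(a + b * c, "out")` returns six lines of text: three loads, a `mul.f32`, an `add.f32`, and a final store. In a real framework, text of this kind would be handed to a JIT compiler; that step is omitted here.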

Research Organization: Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States)

Sponsoring Organization: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)

DOE Contract Number: AC05-06OR23177

OSTI ID: 1158933

Report Number(s): JLAB-IT-14-01; DOE/OR/23177-3190; arXiv:1408.5925; OLCF Titan through Directors Discretionary Allocation LGT006 (2012-2013); INCITE project Allocation LGT003 2012-2013; NSF Award OCI 07-25070; Research Executive Agency (REA) of the European Union Grant PITN-GA-2009-238353

Resource Relation: Conference: 28th International Parallel and Distributed Processing Symposium, 19-23 May 2014, Phoenix, AZ

Country of Publication: United States

Language: English

Similar Records

QDP-JIT/PTX: A QDP++ Implementation for CUDA-Enabled GPUs

Conference · November 1, 2014 · Proceedings of Science · OSTI ID: 1158933

Winter, Frank T.; Edwards, Robert G.

Automatic Offloading C++ Expression Templates to CUDA Enabled GPUs

Conference · May 1, 2012 · OSTI ID: 1158933

Chen, Jie; Joo, Balint; Watson, William A.; +1 more

Computing the Properties of Hadrons, Nuclei, and Nuclear Matter from Quantum Chromodynamics, UNC-CH Final Report.

Technical Report · February 15, 2018 · OSTI ID: 1158933

Fowler, Robert J.
