
Performance Portability of a Wilson Dslash Stencil Operator Mini-App Using Kokkos and SYCL

Conference

We describe our experiences creating mini-apps for the Wilson-Dslash stencil operator for Lattice Quantum Chromodynamics using the Kokkos and SYCL programming models. In particular, we comment on the performance achieved on a variety of hardware architectures, on the limitations we encountered in both programming models, and on how these limitations have been resolved by us or may be resolved by the developers of these models.

Research Organization:
Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Nuclear Physics (NP)
DOE Contract Number:
AC05-06OR23177
OSTI ID:
1976171
Report Number(s):
JLAB-CIO-19-3085; DOE/OR/23177-4806
Resource Relation:
Conference: 2019 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC)
Country of Publication:
United States
Language:
English
