SciTech Connect

Title: Accelerating Twisted Mass LQCD with QPhiX

We present the implementation of twisted mass fermion operators for the QPhiX library. We analyze the performance on the Intel Xeon Phi (Knights Corner) coprocessor as well as on Intel Xeon Haswell CPUs. In particular, we demonstrate that on the Xeon Phi 7120P the Dslash kernel is able to reach 80% of the theoretical peak bandwidth, while on a Xeon Haswell E5-2630 CPU our generated code for the Dslash operator with AVX2 instructions outperforms the corresponding implementation in the tmLQCD library by a factor of $\sim 5\times$ in single precision. We strong scale the code up to 6.8 (14.1) Tflops in single (half) precision on 64 Xeon Haswell CPUs.
Authors:
 [1] ;  [1] ;  [2]
  1. INFN, Rome3
  2. Fermilab
Publication Date:
OSTI Identifier:
1250488
Report Number(s):
arXiv:1510.08879; FERMILAB-CONF-15-528-CD
1402095
DOE Contract Number:
AC02-07CH11359
Resource Type:
Conference
Resource Relation:
Journal Name: PoS; Journal Volume: LATTICE2015; Conference: 33rd International Symposium on Lattice Field Theory, Kobe, Japan, 07/14-07/18/2015
Research Org:
Fermi National Accelerator Laboratory (FNAL), Batavia, IL (United States)
Sponsoring Org:
USDOE Office of Science (SC), High Energy Physics (HEP) (SC-25)
Country of Publication:
United States
Language:
English
Subject:
72 PHYSICS OF ELEMENTARY PARTICLES AND FIELDS