Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms
Journal Article
· Parallel Computing
- Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)
- Kookmin Univ., Seoul (Korea, Republic of)
- Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States)
The next decade of high-performance computing (HPC) systems will see a rapid evolution and divergence of multi- and manycore architectures as power and cooling constraints limit increases in microprocessor clock speeds. Understanding efficient optimization methodologies on diverse multicore designs in the context of demanding numerical methods is one of the greatest challenges faced today by the HPC community. In this paper, we examine the efficient multicore optimization of GTC, a petascale gyrokinetic toroidal fusion code for studying plasma microturbulence in tokamak devices. For GTC’s key computational components (charge deposition and particle push), we explore efficient parallelization strategies across a broad range of emerging multicore designs, including the recently released Intel Nehalem-EX, the AMD Opteron Istanbul, and the highly multithreaded Sun UltraSparc T2+. We also present the first study on tuning gyrokinetic particle-in-cell (PIC) algorithms for graphics processors, using the NVIDIA C2050 (Fermi). Our work discusses several novel optimization approaches for gyrokinetic PIC, including mixed-precision computation, particle binning and decomposition strategies, grid replication, SIMDized atomic floating-point operations, and effective GPU texture memory utilization. Overall, we achieve significant performance improvements of 1.3–4.7× on these complex PIC kernels, despite the inherent challenges of data dependency and locality. Finally, our work also points to several architectural and programming features that could significantly enhance PIC performance and productivity on next-generation architectures.
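As a rough illustration of two of the techniques the abstract names for the charge-deposition kernel (grid replication and particle binning), the following minimal sketch may help. It is not GTC's implementation: it assumes a 1D periodic grid with unit spacing and simple linear weighting in place of the gyrokinetic scheme, it simulates the workers serially, and every function name and the round-robin particle decomposition are hypothetical choices made for this example.

```python
from math import floor

def deposit(positions, charges, n_cells):
    """Scatter each particle's charge onto a periodic 1D grid (linear weighting)."""
    grid = [0.0] * n_cells
    for x, q in zip(positions, charges):
        left = floor(x)
        frac = x - left                        # fractional distance past the left grid point
        cell = left % n_cells                  # periodic wrap of the left grid point
        grid[cell] += q * (1.0 - frac)
        grid[(cell + 1) % n_cells] += q * frac
    return grid

def deposit_replicated(positions, charges, n_cells, n_workers):
    """Grid replication: each worker fills a private grid copy, then the copies are summed.

    The workers run one after another here; in a threaded code the private
    copies are what let them run concurrently without conflicting updates.
    """
    replicas = [
        # Assumed decomposition: worker w takes every n_workers-th particle.
        deposit(positions[w::n_workers], charges[w::n_workers], n_cells)
        for w in range(n_workers)
    ]
    return [sum(values) for values in zip(*replicas)]

def bin_particles(positions, charges, n_cells):
    """Particle binning: reorder particles by cell index so grid updates are clustered."""
    order = sorted(range(len(positions)), key=lambda i: floor(positions[i]) % n_cells)
    return [positions[i] for i in order], [charges[i] for i in order]

if __name__ == "__main__":
    xs = [0.25, 1.5, 3.75, 2.0]
    qs = [1.0, 1.0, 1.0, 1.0]
    print(deposit(xs, qs, 4))
    print(deposit_replicated(xs, qs, 4, n_workers=2))
    print(bin_particles(xs, qs, 4)[0])
```

Replication trades memory (one grid copy per worker plus a final reduction) for the absence of write conflicts, whereas a single shared grid would need atomic or otherwise synchronized updates. Binning leaves the deposited result unchanged up to floating-point summation order and only changes the order in which grid cells are touched.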
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- Intel Corporation (United States); Microsoft Corporation (United States); National Research Foundation of Korea (NRF); USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21); USDOE Office of Science (SC), Fusion Energy Sciences (FES) (SC-24)
- Grant/Contract Number:
- AC02-05CH11231; AC02-09CH11466
- OSTI ID:
- 1407105
- Journal Information:
- Journal Name: Parallel Computing; Vol. 37; Journal Issue: 9; ISSN 0167-8191
- Publisher:
- Elsevier
- Country of Publication:
- United States
- Language:
- English
Similar Records
Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems
Conference
·
Fri Dec 31 23:00:00 EST 2010
·
OSTI ID:1407109
Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors
Conference
·
Wed Dec 31 23:00:00 EST 2008
·
OSTI ID:1407082
Approximate Weighted Matching On Emerging Manycore and Multithreaded Architectures
Journal Article
·
Thu Nov 29 23:00:00 EST 2012
· International Journal of High Performance Computing Applications, 26(4):413-430
·
OSTI ID:1057347