An efficient mixed-precision, hybrid CPU-GPU implementation of a nonlinearly implicit one-dimensional particle-in-cell algorithm

Chen, Guangye; Chacon, Luis; Barnes, Daniel C

doi:10.1016/j.jcp.2012.04.040

Journal Article · January 1, 2012 · Journal of Computational Physics

DOI: https://doi.org/10.1016/j.jcp.2012.04.040 · OSTI ID: 1050319

Chen, Guangye [1]; Chacon, Luis [1]; Barnes, Daniel C [1]

[1] ORNL

Recently, a fully implicit, energy- and charge-conserving particle-in-cell method has been developed for multi-scale, full-f kinetic simulations [G. Chen, et al., J. Comput. Phys. 230, 18 (2011)]. The method employs a Jacobian-free Newton-Krylov (JFNK) solver and is capable of using very large timesteps without loss of numerical stability or accuracy. A fundamental feature of the method is the segregation of particle orbit integrations from the field solver, while remaining fully self-consistent. This provides great flexibility, and dramatically improves the solver efficiency by reducing the degrees of freedom of the associated nonlinear system. However, it requires a particle push per nonlinear residual evaluation, which makes the particle push the most time-consuming operation in the algorithm. This paper describes a very efficient mixed-precision, hybrid CPU-GPU implementation of the implicit PIC algorithm. The JFNK solver is kept on the CPU (in double precision), while the inherent data parallelism of the particle mover is exploited by implementing it in single precision on a graphics processing unit (GPU) using CUDA. Performance-oriented optimizations are employed with the aid of an analytical performance model, the roofline model. Despite being highly dynamic, the adaptive, charge-conserving particle mover algorithm achieves up to 300-400 GOp/s (including single-precision floating-point, integer, and logic operations) on an Nvidia GeForce GTX580, corresponding to 20-25% absolute GPU efficiency (against the peak theoretical performance) and 50-70% intrinsic efficiency (against the algorithm's maximum operational throughput, which neglects all latencies). This is about 200-300 times faster than an equivalent serial CPU implementation. When the single-precision GPU particle mover is combined with a double-precision CPU JFNK field solver, overall performance gains of about 100 times vs. the double-precision CPU-only serial version are obtained, with no apparent loss of robustness or accuracy when applied to a challenging long-time scale ion acoustic wave simulation.
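The roofline model mentioned in the abstract bounds a kernel's attainable throughput by the smaller of the device's peak rate and its memory bandwidth times the kernel's operational intensity. The following is a minimal sketch of that bound and of the "absolute efficiency" ratio, not the authors' performance model. The hardware figures (about 1581 GOp/s single-precision peak and 192.4 GB/s memory bandwidth for a GTX580) are assumptions taken from commonly quoted specifications, not from this record, and the function names are hypothetical.

```python
def roofline_gops(intensity_ops_per_byte, peak_gops, bandwidth_gb_per_s):
    """Attainable throughput (GOp/s) under the basic roofline model.

    A kernel is memory-bound when bandwidth * intensity < peak,
    and compute-bound otherwise.
    """
    return min(peak_gops, bandwidth_gb_per_s * intensity_ops_per_byte)


def efficiency(measured_gops, reference_gops):
    """Fraction of a reference throughput that was actually achieved."""
    return measured_gops / reference_gops


# Assumed GTX580 figures (NOT stated in the abstract).
PEAK_GOPS = 1581.0      # single-precision peak, GOp/s
BANDWIDTH_GBS = 192.4   # device memory bandwidth, GB/s

# Operational intensity at which the two roofs meet (ops per byte).
ridge = PEAK_GOPS / BANDWIDTH_GBS

if __name__ == "__main__":
    for intensity in (1.0, ridge, 32.0):
        bound = roofline_gops(intensity, PEAK_GOPS, BANDWIDTH_GBS)
        print(f"intensity {intensity:6.2f} ops/byte -> bound {bound:7.1f} GOp/s")

    # Throughputs quoted in the abstract, compared with the assumed peak.
    for measured in (300.0, 400.0):
        pct = 100.0 * efficiency(measured, PEAK_GOPS)
        print(f"{measured:.0f} GOp/s -> {pct:.0f}% of assumed peak")
```

Under the assumed peak, 300-400 GOp/s works out to roughly 19-25% of peak, in line with the 20-25% absolute efficiency quoted above. The intrinsic efficiency in the abstract uses a different reference (the algorithm's own maximum operational throughput), which this sketch does not attempt to estimate.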

Research Organization: Oak Ridge National Laboratory (ORNL)

Sponsoring Organization: ORNL LDRD Director's R&D

DOE Contract Number: AC05-00OR22725

OSTI ID: 1050319

Journal Information: Journal of Computational Physics, Vol. 231, Issue 16; ISSN 0021-9991; JCTPAH

Country of Publication: United States

Language: English

Similar Records

Fully implicit particle-in-cell algorithms for kinetic simulation of plasmas [Slides]

Technical Report · Thu Feb 28 23:00:00 EST 2013 · OSTI ID:1063911

Kinetic approach to microscopic-macroscopic coupling in space and laboratory plasmas

Journal Article · Mon May 15 00:00:00 EDT 2006 · Physics of Plasmas · OSTI ID:20783122

An implicit, conservative and asymptotic-preserving electrostatic particle-in-cell algorithm for arbitrarily magnetized plasmas in uniform magnetic fields

Journal Article · Sun Apr 23 00:00:00 EDT 2023 · Journal of Computational Physics · OSTI ID:1972981

Related Subjects

71 CLASSICAL AND QUANTUM MECHANICS
GENERAL PHYSICS
ACCURACY
ALGORITHMS
DEGREES OF FREEDOM
EFFICIENCY
EVALUATION
FLEXIBILITY
GPU computing
IMPLEMENTATION
ION ACOUSTIC WAVES
KINETICS
NONLINEAR PROBLEMS
PERFORMANCE
PROCESSING
SEGREGATION
SIMULATION
STABILITY
implicit particle-in-cell method
