OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations

Abstract

We demonstrate the systematic implementation of recently developed fast explicit kinetic integration algorithms that efficiently solve N coupled ordinary differential equations (subject to initial conditions) on modern GPUs. Using representative test cases (Type Ia supernova explosions), we demonstrate an increase of two or more orders of magnitude in efficiency for solving such systems (realistic thermonuclear networks coupled to fluid dynamics). This implies that important coupled, multiphysics problems in various scientific and technical disciplines that were intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible. As examples of such applications, we present the computational techniques developed for our ongoing deployment of these new methods on modern GPU accelerators. We show that, as in many other scientific applications, ranging from national security to medical advances, the computation can be split into many independent computational tasks, each of relatively small size. Because an individual task does not provide sufficient parallelism for the underlying hardware, especially for accelerators, these tasks must be computed concurrently as a single routine, which we call a batched routine, in order to saturate the hardware with enough work.

Authors:
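 Shyles, Daniel [1]; Dongarra, Jack J. [2]; Guidry, Mike W. [3]; Tomov, Stanimire Z. [3]; Billings, Jay Jay [3]; Brock, Benjamin A. [3]; Haidar Ahmad, Azzam A. [3]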
  1. University of Tennessee (UT)
  2. University of Tennessee, Knoxville (UTK)
  3. ORNL
Publication Date:
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1393889
DOE Contract Number:
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: 2016 IEEE High Performance Extreme Computing Conference (HPEC'16) - Waltham, Massachusetts, United States of America - 9/13/2016
Country of Publication:
United States
Language:
English

Citation Formats

Shyles, Daniel, Dongarra, Jack J., Guidry, Mike W., Tomov, Stanimire Z., Billings, Jay Jay, Brock, Benjamin A., and Haidar Ahmad, Azzam A. Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations. United States: N. p., 2016. Web.
Shyles, Daniel, Dongarra, Jack J., Guidry, Mike W., Tomov, Stanimire Z., Billings, Jay Jay, Brock, Benjamin A., & Haidar Ahmad, Azzam A. Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations. United States.
Shyles, Daniel, Dongarra, Jack J., Guidry, Mike W., Tomov, Stanimire Z., Billings, Jay Jay, Brock, Benjamin A., and Haidar Ahmad, Azzam A. 2016. "Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations". United States. https://www.osti.gov/servlets/purl/1393889.
@article{osti_1393889,
title = {Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations},
author = {Shyles, Daniel and Dongarra, Jack J. and Guidry, Mike W. and Tomov, Stanimire Z. and Billings, Jay Jay and Brock, Benjamin A. and Haidar Ahmad, Azzam A.},
abstractNote = {We demonstrate the systematic implementation of recently developed fast explicit kinetic integration algorithms that efficiently solve N coupled ordinary differential equations (subject to initial conditions) on modern GPUs. Using representative test cases (Type Ia supernova explosions), we demonstrate an increase of two or more orders of magnitude in efficiency for solving such systems (realistic thermonuclear networks coupled to fluid dynamics). This implies that important coupled, multiphysics problems in various scientific and technical disciplines that were intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible. As examples of such applications, we present the computational techniques developed for our ongoing deployment of these new methods on modern GPU accelerators. We show that, as in many other scientific applications, ranging from national security to medical advances, the computation can be split into many independent computational tasks, each of relatively small size. Because an individual task does not provide sufficient parallelism for the underlying hardware, especially for accelerators, these tasks must be computed concurrently as a single routine, which we call a batched routine, in order to saturate the hardware with enough work.},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = 2016,
month = 9
}


  • Cited by 1
  • In this study, we demonstrate the first implementation of recently-developed fast explicit kinetic integration algorithms on modern graphics processing unit (GPU) accelerators. Taking as a generic test case a Type Ia supernova explosion with an extremely stiff thermonuclear network having 150 isotopic species and 1604 reactions coupled to hydrodynamics using operator splitting, we demonstrate the capability to solve of order 100 realistic kinetic networks in parallel in the same time that standard implicit methods can solve a single such network on a CPU. In addition, this orders-of-magnitude decrease in computation time for solving systems of realistic kinetic networks implies that important coupled, multiphysics problems in various scientific and technical fields that were intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible.
  • We demonstrate the first implementation of recently-developed fast explicit kinetic integration algorithms on modern graphics processing unit (GPU) accelerators. Taking as a generic test case a Type Ia supernova explosion with an extremely stiff thermonuclear network having 150 isotopic species and 1604 reactions coupled to hydrodynamics using operator splitting, we demonstrate the capability to solve of order 100 realistic kinetic networks in parallel in the same time that standard implicit methods can solve a single such network on a CPU. This orders-of-magnitude decrease in computation time for solving systems of realistic kinetic networks implies that important coupled, multiphysics problems in various scientific and technical fields that were intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible.
  • A method is proposed for measuring the effective reproduction factor, k, in subcritical systems. The method uses the transient response of a subcritical system to the sudden removal of an extraneous neutron source (i.e., a source jerk). The response is analyzed using an inverse kinetic technique that least-squares fits the exact analytical solution corresponding to a source-jerk transient as derived from the point-reactor model. It has been found that the technique can provide an accurate means of measuring k in systems that are close to critical (i.e., 0.95 < k < 1.0). As a system becomes more subcritical (i.e., k << 1.0) spatial effects can introduce significant biases depending on the source and detector positions. However, methods are available that can correct for these biases and, hence, can allow measuring subcriticality in systems with k as low as 0.5. 12 refs., 3 figs.
  • Numerical methods are applied to a coupled system of one-dimensional unsteady reaction-diffusion equations to seek propagating wave solutions. These equations model flame propagation in certain combustion systems when constant pressure combustion is assumed, with Lewis number unity, and when a Lagrangian coordinate transformation is introduced. In the numerical integration schemes the reaction terms are computed non-iteratively, using two different second-order accurate methods. In the first, the numerical timestep is limited by the Lipschitz timestep constraint as in the case with standard explicit schemes. The second scheme is constructed in such a way that this timestep limitation is not present and computations are made with timesteps which are orders of magnitude larger than those possible with standard methods. 9 refs.