OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations

Abstract

We demonstrate the systematic implementation of recently developed fast explicit kinetic integration algorithms that efficiently solve N coupled ordinary differential equations (subject to initial conditions) on modern GPUs. Taking representative test cases (Type Ia supernova explosions), we demonstrate an increase of two or more orders of magnitude in efficiency for solving such systems (realistic thermonuclear networks coupled to fluid dynamics). This implies that important coupled multiphysics problems in various scientific and technical disciplines that were previously intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible. As examples of such applications, we present the computational techniques developed for our ongoing deployment of these new methods on modern GPU accelerators. We show that, as in many other scientific applications ranging from national security to medical advances, the computation can be split into many independent computational tasks, each of relatively small size. Because the size of each individual task does not provide sufficient parallelism for the underlying hardware, especially for accelerators, these tasks must be computed concurrently as a single routine, which we call a batched routine, in order to saturate the hardware with enough work.
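The batching idea in the abstract can be illustrated with a small sketch. This is not the authors' code; it is a minimal NumPy illustration of the same principle: many small, independent kinetic systems (here, toy linear networks dy/dt = A·y) are advanced with one vectorized operation per explicit time step, rather than looping over tasks that are individually too small to saturate the hardware. The function name and the toy 2-species matrix are hypothetical.

```python
import numpy as np

def batched_explicit_euler(A, y0, dt, steps):
    """Advance a batch of independent linear systems dy/dt = A[i] @ y[i]
    with explicit Euler. Each step processes the whole batch in a single
    vectorized operation -- the 'batched routine' idea from the abstract."""
    y = y0.copy()
    for _ in range(steps):
        # One einsum applies every system's matrix to its state at once,
        # instead of iterating over the (too-small) individual tasks.
        y += dt * np.einsum('bij,bj->bi', A, y)
    return y

# Toy batch: 1000 independent 2-species networks with simple decay/feed terms.
batch, n = 1000, 2
A = np.tile(np.array([[-1.0, 0.0],
                      [ 1.0, -0.5]]), (batch, 1, 1))
y0 = np.ones((batch, n))
y = batched_explicit_euler(A, y0, dt=1e-3, steps=1000)
```

On a GPU, the same pattern maps to one kernel launch (or one batched library call) covering all systems, which is what allows the hardware to be saturated with work.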

Authors:
 Shyles, Daniel [1]; Dongarra, Jack J. [2]; Guidry, Mike W. [3]; Tomov, Stanimire Z. [3]; Billings, Jay Jay [3]; Brock, Benjamin A. [3]; Haidar Ahmad, Azzam A. [3]
  1. University of Tennessee (UT)
  2. University of Tennessee, Knoxville (UTK)
  3. Oak Ridge National Laboratory (ORNL)
Publication Date:
September 1, 2016
Research Org.:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1393889
DOE Contract Number:  
AC05-00OR22725
Resource Type:
Conference
Resource Relation:
Conference: 2016 IEEE High Performance Extreme Computing Conference (HPEC'16), Waltham, Massachusetts, United States of America, September 13, 2016
Country of Publication:
United States
Language:
English

Citation Formats

Shyles, Daniel, Dongarra, Jack J., Guidry, Mike W., Tomov, Stanimire Z., Billings, Jay Jay, Brock, Benjamin A., and Haidar Ahmad, Azzam A.. Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations. United States: N. p., 2016. Web.
Shyles, Daniel, Dongarra, Jack J., Guidry, Mike W., Tomov, Stanimire Z., Billings, Jay Jay, Brock, Benjamin A., & Haidar Ahmad, Azzam A.. Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations. United States.
Shyles, Daniel, Dongarra, Jack J., Guidry, Mike W., Tomov, Stanimire Z., Billings, Jay Jay, Brock, Benjamin A., and Haidar Ahmad, Azzam A.. 2016. "Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations". United States. https://www.osti.gov/servlets/purl/1393889.
@inproceedings{osti_1393889,
title = {Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations},
author = {Shyles, Daniel and Dongarra, Jack J. and Guidry, Mike W. and Tomov, Stanimire Z. and Billings, Jay Jay and Brock, Benjamin A. and Haidar Ahmad, Azzam A.},
booktitle = {2016 IEEE High Performance Extreme Computing Conference (HPEC'16)},
abstractNote = {We demonstrate the systematic implementation of recently developed fast explicit kinetic integration algorithms that efficiently solve N coupled ordinary differential equations (subject to initial conditions) on modern GPUs. Taking representative test cases (Type Ia supernova explosions), we demonstrate an increase of two or more orders of magnitude in efficiency for solving such systems (realistic thermonuclear networks coupled to fluid dynamics). This implies that important coupled multiphysics problems in various scientific and technical disciplines that were previously intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible. As examples of such applications, we present the computational techniques developed for our ongoing deployment of these new methods on modern GPU accelerators. We show that, as in many other scientific applications ranging from national security to medical advances, the computation can be split into many independent computational tasks, each of relatively small size. Because the size of each individual task does not provide sufficient parallelism for the underlying hardware, especially for accelerators, these tasks must be computed concurrently as a single routine, which we call a batched routine, in order to saturate the hardware with enough work.},
place = {United States},
year = {2016},
month = {sep}
}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.
