

This content will become publicly available on December 22, 2018

Title: Multigroup Monte Carlo on GPUs: Comparison of history- and event-based algorithms

This article investigates the performance of different multigroup Monte Carlo (MC) transport algorithms on GPUs, covering both history-based and event-based approaches. Several algorithmic improvements are introduced for both approaches. When the history-based algorithm traditionally favored in CPU-based MC codes is modified to occasionally filter out dead particles, thread divergence is reduced and performance exceeds that of both the pure history-based and the event-based approaches. The impacts of several algorithmic choices are discussed, including performance studies on Kepler and Pascal generation NVIDIA GPUs for fixed source and eigenvalue calculations. Single-device performance equivalent to 20–40 CPU cores on the K40 GPU and 60–80 CPU cores on the P100 GPU is achieved. In addition, nearly perfect multi-device parallel weak scaling is demonstrated on more than 16,000 nodes of the Titan supercomputer.
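The effect of filtering out dead particles can be illustrated with a toy cost model. The Python sketch below is not the authors' implementation and does not run on a GPU. It only counts how many thread-steps a batch occupies under simplifying assumptions: every thread slot stays busy until its launch ends, each collision costs one step, and compaction and kernel launches are free. The function names, the absorption probability, and the filter interval are all hypothetical, chosen for illustration only.

```python
import random


def sample_history_lengths(n_particles, absorb_prob, seed=0):
    """Sample how many collisions each particle undergoes before absorption.

    Assumption: each collision absorbs the particle with a fixed
    probability, so history lengths are geometrically distributed.
    """
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_particles):
        steps = 1
        while rng.random() > absorb_prob:
            steps += 1
        lengths.append(steps)
    return lengths


def useful_steps(lengths):
    """Steps spent on live particles: the lower bound on any schedule."""
    return sum(lengths)


def thread_steps_pure_history(lengths):
    """Pure history-based model: one launch, and every thread slot stays
    occupied until the longest history in the batch has finished."""
    return len(lengths) * max(lengths)


def thread_steps_with_filtering(lengths, filter_interval):
    """History-based model with periodic filtering: run filter_interval
    steps, drop the dead particles, then relaunch with only the survivors."""
    alive = list(lengths)  # remaining steps for each live particle
    total = 0
    while alive:
        total += len(alive) * filter_interval
        alive = [r - filter_interval for r in alive if r > filter_interval]
    return total


if __name__ == "__main__":
    lengths = sample_history_lengths(n_particles=10_000, absorb_prob=0.3)
    print("useful steps:        ", useful_steps(lengths))
    print("pure history-based:  ", thread_steps_pure_history(lengths))
    print("filter every 4 steps:", thread_steps_with_filtering(lengths, 4))
```

For the hand-checkable batch `[1, 2, 3, 8]`, the model charges 32 thread-steps to the pure history-based schedule and 16 when filtering every 2 steps, against 14 useful steps. Filtering after every step reaches the 14-step bound in this model, but only because the sketch ignores the cost of compaction and relaunching, which a real implementation has to weigh against the divergence it removes.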
  1. Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Publication Date:
Grant/Contract Number:
Accepted Manuscript
Journal Name:
Annals of Nuclear Energy (Oxford)
Additional Journal Information:
Journal Name: Annals of Nuclear Energy (Oxford); Journal Volume: 113; Journal Issue: C; Journal ID: ISSN 0306-4549
Research Org:
Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)
Sponsoring Org:
USDOE National Nuclear Security Administration (NNSA); USDOE Office of Science (SC)
Country of Publication:
United States
Subject:
97 MATHEMATICS AND COMPUTING; Radiation transport; Monte Carlo; GPU
OSTI Identifier: