OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library

Authors:
Bleile, R C; Brantley, P S; O'Brien, M J; Childs, H H
Publication Date:
2016-06-21
Research Org.:
Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1335794
Report Number(s):
LLNL-CONF-695977
DOE Contract Number:
AC52-07NA27344
Resource Type:
Conference
Resource Relation:
Conference: Presented at: 2016 ANS Winter Meeting and Nuclear Technology Expo, Las Vegas, NV, United States, Nov 06 - Nov 10, 2016
Country of Publication:
United States
Language:
English
Subject:
73 NUCLEAR PHYSICS AND RADIATION PHYSICS; 97 MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE

Citation Formats

Bleile, R C, Brantley, P S, O'Brien, M J, and Childs, H H. Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library. United States: N. p., 2016. Web.
Bleile, R C, Brantley, P S, O'Brien, M J, & Childs, H H. Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library. United States.
Bleile, R C, Brantley, P S, O'Brien, M J, and Childs, H H. 2016. "Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library". United States. https://www.osti.gov/servlets/purl/1335794.
@article{osti_1335794,
title = {Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library},
author = {Bleile, R C and Brantley, P S and O'Brien, M J and Childs, H H},
place = {United States},
year = {2016},
month = {jun}
}


Similar Records:
  • The traditional form of parallelism in Monte Carlo particle transport simulations, wherein each individual particle history is considered a unit of work, does not lend itself well to data-level parallelism. Event-based algorithms, which were originally used for simulations on vector processors, may offer a path toward better utilizing data-level parallelism in modern computer architectures. In this study, a simple model is developed for estimating the efficiency of the event-based particle transport algorithm under two sets of assumptions. Data collected from simulations of four reactor problems using OpenMC were then used in conjunction with the models to calculate the speedup due to vectorization as a function of two parameters: the size of the particle bank and the vector width. When each event type is assumed to have constant execution time, the achievable speedup is directly related to the particle bank size. We observed that the bank size generally needs to be at least 20 times greater than the vector size in order to achieve vector efficiency greater than 90%. When the execution times for events are allowed to vary, however, the vector speedup is also limited by differences in execution time for events being carried out in a single event-iteration. For some problems, this implies that vector efficiencies over 50% may not be attainable. While there are many factors impacting performance of an event-based algorithm that are not captured by our model, it nevertheless provides insights into factors that may be limiting in a real implementation.
  • Monte Carlo particle transport is easy to implement on massively parallel computers relative to other methods of transport simulation. This paper describes experiences of implementing a realistic demonstration Monte Carlo code on a variety of parallel architectures. Our "pool of tasks" technique, which allows reproducibility from run to run regardless of the number of processors, is discussed. We present detailed timing studies of simulations performed on the 128 processor BBN-ACI TC2000 and preliminary timing results for the 32 processor Kendall Square Research KSR-1. Given sufficient workload to distribute across many computational nodes, the BBN achieves nearly linear speedup for a large number of nodes. The KSR, with which we have had less experience, performs poorly with more than ten processors. A simple model incorporating known causes of overhead accurately predicts observed behavior. A general-purpose communication and control package to facilitate the implementation of existing Monte Carlo packages is described together with timings on the BBN. This package adds insignificantly to the computational costs of parallel simulations.
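
The first abstract above relates vector efficiency to the particle bank size and the vector width. As a rough illustration only, the sketch below counts how many vector lanes do useful work when a bank is processed in fixed-width chunks, assuming every event takes the same time. It is not the model from that study and does not attempt to reproduce its figures; the function names, the single event type, and the fixed `survival` fraction per event-iteration are all hypothetical simplifications.

```python
import math


def lane_efficiency(n_particles: int, vector_width: int) -> float:
    """Fraction of issued vector lanes that hold a live particle.

    Particles are processed in chunks of `vector_width`; the last chunk
    may be only partly filled, which wastes lanes.
    """
    if n_particles == 0:
        return 1.0  # nothing issued, nothing wasted
    chunks = math.ceil(n_particles / vector_width)
    return n_particles / (chunks * vector_width)


def bank_efficiency(bank_size: int, vector_width: int, survival: float = 0.9) -> float:
    """Overall lane efficiency while a bank drains without being refilled.

    Assumption (hypothetical): after each event-iteration a fixed fraction
    `survival` (< 1) of the particles remains, rounded down.
    """
    useful = 0  # lanes that processed a live particle
    issued = 0  # lanes issued in total, including idle ones
    n = bank_size
    while n > 0:
        chunks = math.ceil(n / vector_width)
        useful += n
        issued += chunks * vector_width
        n = int(n * survival)  # rounds down, so the loop terminates
    return useful / issued if issued else 1.0


if __name__ == "__main__":
    width = 8
    for multiple in (1, 2, 20, 200):
        eff = bank_efficiency(multiple * width, width)
        print(f"bank = {multiple:>3} x width: efficiency = {eff:.3f}")
```

In this toy setting a larger bank keeps more lanes busy because the partly filled final chunk of each iteration matters less. Events with varying execution times, which the abstract identifies as a further limit, are not modeled here.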