Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library
 Authors:
 Bleile, R C; Brantley, P S; O'Brien, M J; Childs, H H
 Publication Date:
 Research Org.:
 Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
 Sponsoring Org.:
 USDOE
 OSTI Identifier:
 1335794
 Report Number(s):
 LLNL-CONF-695977
 DOE Contract Number:
 AC52-07NA27344
 Resource Type:
 Conference
 Resource Relation:
 Conference: Presented at: 2016 ANS Winter Meeting and Nuclear Technology Expo, Las Vegas, NV, United States, Nov 06 - Nov 10, 2016
 Country of Publication:
 United States
 Language:
 English
 Subject:
 73 NUCLEAR PHYSICS AND RADIATION PHYSICS; 97 MATHEMATICS, COMPUTING, AND INFORMATION SCIENCE
Citation Formats
Bleile, R C, Brantley, P S, O'Brien, M J, and Childs, H H. Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library. United States: N. p., 2016.
Web.
Bleile, R C, Brantley, P S, O'Brien, M J, & Childs, H H. Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library. United States.
Bleile, R C, Brantley, P S, O'Brien, M J, and Childs, H H. 2016.
"Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library". United States.
https://www.osti.gov/servlets/purl/1335794.
@article{osti_1335794,
title = {Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library},
author = {Bleile, R C and Brantley, P S and O'Brien, M J and Childs, H H},
abstractNote = {},
doi = {},
journal = {},
number = {},
volume = {},
place = {United States},
year = 2016,
month = 6
}

Limits on the Efficiency of Event-Based Algorithms for Monte Carlo Neutron Transport
The traditional form of parallelism in Monte Carlo particle transport simulations, wherein each individual particle history is considered a unit of work, does not lend itself well to data-level parallelism. Event-based algorithms, which were originally used for simulations on vector processors, may offer a path toward better utilizing data-level parallelism in modern computer architectures. In this study, a simple model is developed for estimating the efficiency of the event-based particle transport algorithm under two sets of assumptions. Data collected from simulations of four reactor problems using OpenMC was then used in conjunction with the models to calculate the speedup due …
Experiences with different parallel programming paradigms for Monte Carlo particle transport leads to a portable toolkit for parallel Monte Carlo
Monte Carlo particle transport is easy to implement on massively parallel computers relative to other methods of transport simulation. This paper describes experiences of implementing a realistic demonstration Monte Carlo code on a variety of parallel architectures. Our "pool of tasks" technique, which allows reproducibility from run to run regardless of the number of processors, is discussed. We present detailed timing studies of simulations performed on the 128-processor BBN-ACI TC2000 and preliminary timing results for the 32-processor Kendall Square Research KSR-1. Given sufficient workload to distribute across many computational nodes, the BBN achieves nearly linear speedup for a …