Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library

Bleile, Ryan C.; Brantley, Patrick S.; O'Brien, Matthew J.; Childs, Hank


Journal Article · July 1, 2016 · Transactions of the American Nuclear Society

OSTI ID:23042651

Bleile, Ryan C. [1]; Brantley, Patrick S.; O'Brien, Matthew J. [1]; Childs, Hank [2]

[1] Lawrence Livermore National Laboratory, P.O. Box 808, Livermore, CA 94551 (United States)
[2] Department of Computer and Information Science, University of Oregon, Eugene, OR 97403 (United States)

High performance computing environments are steadily moving toward many-core architectures. The Los Alamos National Laboratory Trinity machine, available in late 2016, will use both Intel Xeon Haswell processors and Intel Xeon Phi Knights Landing Many Integrated Core (MIC) coprocessors. The Lawrence Livermore National Laboratory Sierra machine, available in 2018, will use an IBM PowerPC architecture along with Nvidia graphics processing units (GPUs). Applications that run in this supercomputing environment must continue to adapt to take advantage of the diverse hardware architectures that are coming. A significant consideration is not only the performance of an application on a given platform but also its portability to other platforms.

The algorithmic improvements presented in this paper build upon recently reported work on event-based Monte Carlo transport in the ALPSMC code, which models particle transport in one-dimensional binary stochastic media. That earlier paper discussed the lack of available vectorization in the traditional history-based algorithm used for Monte Carlo transport and presented a data-parallel event-based algorithm implemented using the Nvidia Thrust library for portability. The performance of the Thrust implementation was compared to that of a native CUDA implementation. That work concluded that the Thrust library abstraction caused too great a loss in performance, but that the event-based method was a viable option that should be investigated further.

In this paper, we describe algorithmic improvements to the previously presented data-parallel event-based algorithm. We further optimized the event-based CUDA implementation, most notably through data structure changes, a new conditional particle removal scheme in the event-based process, and the use of multiple GPUs. In addition to improving the algorithm, we re-implemented the Thrust version from the now further optimized CUDA version, giving the abstraction a greater chance of being performant. Finally, we revisited our previous assumptions about the inability of the history-based method to achieve performance on vector-style architectures such as MICs and GPUs, with surprising and promising results. (authors)
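To make the event-based idea concrete, the following is a minimal sketch, not the ALPSMC implementation. It assumes a toy one-dimensional homogeneous slab (no stochastic media), and every name in it (`Particle`, `Tally`, `run_event_based`, and the parameters) is hypothetical. Standard C++ loops and `std::remove_if` stand in for the data-parallel primitives that a Thrust version would likely use (for example `thrust::for_each` and `thrust::remove_if`). The abstract does not describe the paper's conditional removal scheme, so the sketch simply compacts the batch after every pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical particle record for a 1D slab; not the ALPSMC data layout.
struct Particle {
    double x;      // position along the slab, valid range [0, width]
    double mu;     // direction cosine in [-1, 1]
    bool   alive;  // cleared once the particle leaks or is absorbed
};

// Counts of how particle histories ended.
struct Tally {
    std::size_t leaked = 0;
    std::size_t absorbed = 0;
};

// Event-based transport: each pass applies one event type to every particle
// in the batch, then compacts the batch by removing finished particles.
// A history-based code would instead follow one particle to completion
// before starting the next.
Tally run_event_based(std::size_t n, double width, double sigma_t,
                      double absorb_prob, unsigned seed) {
    // absorb_prob > 0 is an assumption of this sketch, made so that every
    // particle eventually terminates.
    assert(width > 0.0 && sigma_t > 0.0 && absorb_prob > 0.0);

    // A single serial generator keeps the sketch short; a GPU version
    // would need an independent random stream per particle.
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> uni(0.0, 1.0);

    // All particles start at the left face, moving to the right.
    std::vector<Particle> batch(n, Particle{0.0, 1.0, true});
    Tally tally;

    while (!batch.empty()) {
        // Event 1: move every particle to its next collision site.
        for (Particle& p : batch) {
            const double dist = -std::log(1.0 - uni(rng)) / sigma_t;
            p.x += p.mu * dist;
        }

        // Event 2: resolve leakage, absorption, or isotropic scattering.
        for (Particle& p : batch) {
            if (p.x < 0.0 || p.x > width) {
                ++tally.leaked;
                p.alive = false;
            } else if (uni(rng) < absorb_prob) {
                ++tally.absorbed;
                p.alive = false;
            } else {
                p.mu = 2.0 * uni(rng) - 1.0;
            }
        }

        // Compaction: drop finished particles so later passes touch only
        // live ones (the role thrust::remove_if would play on a device).
        batch.erase(std::remove_if(batch.begin(), batch.end(),
                                   [](const Particle& p) { return !p.alive; }),
                    batch.end());
    }
    return tally;
}
```

Calling `run_event_based(1000, 5.0, 1.0, 0.3, 42u)` returns a tally whose `leaked` and `absorbed` counts sum to 1000, since each particle is counted exactly once before it is removed. Compacting on every pass is the simplest policy; removing particles only when some condition is met trades extra work on dead particles against fewer compaction passes, which is the kind of trade-off a conditional removal scheme would tune.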


Journal Information: Transactions of the American Nuclear Society, Vol. 115; Conference: 2016 ANS Winter Meeting and Nuclear Technology Expo, Las Vegas, NV (United States), 6-10 Nov 2016; Other Information: Country of input: France; 7 refs.; available from American Nuclear Society - ANS, 555 North Kensington Avenue, La Grange Park, IL 60526 (US); ISSN 0003-018X

Country of Publication: United States

Language: English

Similar Records

Investigation of Portable Event-Based Monte Carlo Transport Using the NVIDIA Thrust Library

Journal Article · June 15, 2016 · Transactions of the American Nuclear Society · OSTI ID:23042651

Bleile, Ryan C.; Brantley, Patrick S.; Dawson, Shawn A.; +2 more

Case Study of Using Kokkos and SYCLs Performance-Portable Frameworks for Milc-Dslash Benchmark on NVIDIA, AMD and Intel GPUs

Conference · January 1, 2021 · OSTI ID:23042651

Dufek, Amanda S; Gayatri, Rahulkumar; Mehta, Neil A; +4 more

Graphics processing unit accelerated phase field dislocation dynamics: Application to bi-metallic interfaces

Journal Article · October 14, 2017 · Advances in Engineering Software · OSTI ID:23042651

Eghtesad, Adnan; Germaschewski, Kai; Beyerlein, Irene J.; +2 more

Related Subjects

97 MATHEMATICAL METHODS AND COMPUTING
73 NUCLEAR PHYSICS AND RADIATION PHYSICS
ALGORITHMS
COMPARATIVE EVALUATIONS
COMPUTER ARCHITECTURE
MONTE CARLO METHOD
ONE-DIMENSIONAL CALCULATIONS
OPTIMIZATION
PARTICLE MODELS
PERFORMANCE
STOCHASTIC PROCESSES
VECTORS
