GPU Profiling and Optimizing xRAGE (Final Report)
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
Our project’s objective is to increase the efficiency of GPU-enabled kernels in xRAGE. To do so, we profile the GPU with Nsight Systems on the xRAGE tests unsplit_sod_1d and unsplit_sedov_2d to identify bottlenecks and understand the behavior of the GPU during code execution. Next, we analyze the generated GPU profiles to locate the lines of code whose optimization has the most potential for improving runtime. We replicate the structure of that code in smaller test problems that are easier to understand and edit and that run quickly. Within these test problems, we implement two methods of improving performance: transforming nested loops into a single MDRangePolicy, and hierarchical parallelization using teams of threads. Both methods show speedups in the test code, and after being transferred to xRAGE, both show speedups of up to 30x on various computing platforms. Profiling the edited versions of xRAGE reveals that the GPU executes the former bottlenecks with greater efficiency.
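The first method mentioned above, replacing a nested loop with a single multidimensional range, can be illustrated with a minimal sketch in plain standard C++. This is not Kokkos code and is not taken from xRAGE: the function names `update_cell`, `fill_nested`, and `fill_flat` are hypothetical, and the per-cell update is a placeholder. The sketch only shows the underlying index mapping, assuming every (i, j) iteration is independent, which is what allows a parallel runtime to hand out all rows*cols iterations at once instead of parallelizing the outer loop alone.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical kernel body: any update that depends only on its own (i, j).
inline double update_cell(std::size_t i, std::size_t j) {
    return static_cast<double>(i) * 10.0 + static_cast<double>(j);
}

// Nested form: if only the outer loop is parallelized, at most `rows`
// iterations are available to run concurrently.
std::vector<double> fill_nested(std::size_t rows, std::size_t cols) {
    std::vector<double> out(rows * cols);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            out[i * cols + j] = update_cell(i, j);
        }
    }
    return out;
}

// Flattened form: a single loop over rows*cols independent iterations.
// Each flat index k is mapped back to its (i, j) pair, so the whole
// iteration space is exposed to the scheduler at once.
std::vector<double> fill_flat(std::size_t rows, std::size_t cols) {
    std::vector<double> out(rows * cols);
    for (std::size_t k = 0; k < rows * cols; ++k) {
        const std::size_t i = k / cols;  // row index recovered from k
        const std::size_t j = k % cols;  // column index recovered from k
        out[k] = update_cell(i, j);
    }
    return out;
}
```

Both functions produce the same array; the difference is only in how much parallel work a single loop exposes. In a Kokkos code the index mapping would presumably be handled by the range policy itself rather than written by hand.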
- Research Organization:
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sponsoring Organization:
- USDOE National Nuclear Security Administration (NNSA)
- DOE Contract Number:
- 89233218CNA000001
- OSTI ID:
- 1996148
- Report Number(s):
- LA-UR--23-29613
- Country of Publication:
- United States
- Language:
- English