Accelerating x-ray tracing for exascale systems using Kokkos
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Los Alamos National Laboratory (LANL), Los Alamos, NM (United States)
- Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
The upcoming exascale computing systems Frontier and Aurora will draw much of their computing power from GPU accelerators. The hardware for these systems will be provided by AMD and Intel, respectively, each supporting its own GPU programming model. The challenge for applications that harness one of these exascale systems will be to avoid vendor lock-in while preserving performance portability. We report our results of using Kokkos to accelerate a real-world application on NERSC's Perlmutter Phase 1 (using NVIDIA A100 accelerators) and Crusher, the testbed system for OLCF's Frontier (using AMD MI250X). By porting to Kokkos, we ran the same X-ray tracing code on both systems and achieved speed-ups between 13% and 66% compared to the original CUDA code. These results are a highly encouraging demonstration of using Kokkos to accelerate production science code.
- Research Organization:
- Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States)
- Sponsoring Organization:
- USDOE Office of Science (SC), Basic Energy Sciences (BES). Scientific User Facilities (SUF); USDOE National Nuclear Security Administration (NNSA)
- Grant/Contract Number:
- NA0003525; AC02-05CH11231; AC05-00OR22725
- OSTI ID:
- 2500817
- Journal Information:
- Concurrency and Computation: Practice and Experience, Vol. 36, Issue 5; ISSN 1532-0626
- Publisher:
- Wiley
- Country of Publication:
- United States
- Language:
- English