U.S. Department of Energy
Office of Scientific and Technical Information

Evaluating Nonuniform Reduction in HIP and SYCL on GPUs

Conference

Motivated by maturing programming models and portability for heterogeneous computing, we describe the challenges posed by hardware architectures and programming models when migrating an optimized implementation of nonuniform reduction from CUDA to HIP and SYCL. We explain the migration experience, evaluate the performance of the reduction on GPU-based computing platforms, and provide feedback on improving portability for the development of the SYCL programming model.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1996715
Country of Publication:
United States
Language:
English

