U.S. Department of Energy
Office of Scientific and Technical Information

Evaluating the Performance of Integer Sum Reduction in SYCL on GPUs

Conference

SYCL is a promising programming model for heterogeneous computing that allows single-source code to target devices from multiple vendors. Integer sum reduction is an important primitive operation performed on these accelerators. This paper presents several SYCL implementations of integer sum reduction, using atomic functions, shared local memory, vectorized memory accesses, and parameterized workload sizes, and compares the performance and maturity of SYCL against open-source vendor-specific implementations of the same reduction. For a sufficiently large number of integers, tuning the parameters of our SYCL implementations achieves a 1.4X speedup over the open-source implementations on an Intel UHD630 integrated GPU. On an Nvidia P100 GPU, the SYCL reduction is 3% faster than the templated reduction in Thrust and 0.3% faster than the device reduction in CUB. On an Nvidia V100 GPU, it is 1.9% faster than the templated reduction in Thrust and 0.4% faster than the device reduction in CUB.
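
To make the combination of techniques named in the abstract concrete, the following is a minimal host-side sketch in standard C++ that emulates the usual structure of such a reduction: each work-item sums several inputs, each work-group combines its partial sums with a tree reduction in a local buffer, and one atomic add per group updates the global result. It is not SYCL code and not the authors' implementation. The names `group_partial_sum` and `reduce_sum` and the default values of `group_size` and `items_per_thread` are hypothetical, chosen only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: emulates one work-group on the host.
// Each of the `group_size` work-items first sums `items_per_thread` inputs
// (the parameterized workload size); the group then combines the partial
// sums with a tree reduction in a buffer standing in for shared local memory.
std::int64_t group_partial_sum(const std::vector<int>& input,
                               std::size_t group_id,
                               std::size_t group_size,
                               std::size_t items_per_thread) {
    std::vector<std::int64_t> local(group_size, 0);  // stand-in for local memory
    const std::size_t base = group_id * group_size * items_per_thread;

    for (std::size_t lid = 0; lid < group_size; ++lid) {
        for (std::size_t k = 0; k < items_per_thread; ++k) {
            // Strided indexing: neighbouring work-items read neighbouring elements.
            const std::size_t idx = base + k * group_size + lid;
            if (idx < input.size()) local[lid] += input[idx];
        }
    }

    // Tree reduction: the number of active work-items halves at every step.
    for (std::size_t stride = group_size / 2; stride > 0; stride /= 2) {
        for (std::size_t lid = 0; lid < stride; ++lid) {
            local[lid] += local[lid + stride];
        }
        // On a GPU, a work-group barrier would separate consecutive steps.
    }
    return local[0];
}

// Hypothetical driver: one atomic add per work-group into the global sum.
// The groups are visited sequentially here; on a device they run concurrently,
// which is why the final accumulation is atomic.
std::int64_t reduce_sum(const std::vector<int>& input,
                        std::size_t group_size = 256,
                        std::size_t items_per_thread = 4) {
    assert(group_size > 0 && (group_size & (group_size - 1)) == 0);  // power of two
    assert(items_per_thread > 0);

    const std::size_t per_group = group_size * items_per_thread;
    const std::size_t num_groups = (input.size() + per_group - 1) / per_group;

    std::atomic<std::int64_t> total{0};
    for (std::size_t g = 0; g < num_groups; ++g) {
        total.fetch_add(group_partial_sum(input, g, group_size, items_per_thread),
                        std::memory_order_relaxed);
    }
    return total.load();
}
```

Calling `reduce_sum(values)` with a `std::vector<int>` returns the sum as a 64-bit integer. Changing `group_size` and `items_per_thread` alters only how the work is partitioned, not the result, and these are the kind of parameters the abstract describes tuning per device.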

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1840191
Country of Publication:
United States
Language:
English
