U.S. Department of Energy
Office of Scientific and Technical Information

Evaluating the Performance of Integer Sum Reduction on an Intel GPU

Conference

Sum reduction is a primitive operation in parallel computing, and SYCL is a promising heterogeneous programming language. In this paper, we describe SYCL implementations of integer sum reduction that use atomic functions, shared local memory, vectorized memory accesses, and parameterized workload sizes. Our evaluation of the reduction kernels shows that we can achieve a 1.4X speedup over open-source implementations of sum reduction for a sufficiently large number of integers on an Intel integrated GPU.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1840205
Country of Publication:
United States
Language:
English
