U.S. Department of Energy
Office of Scientific and Technical Information

A Case Study with the HACCmk Kernel in SYCL

Technical Report · DOI: https://doi.org/10.2172/1576562 · OSTI ID: 1576562
Argonne National Lab. (ANL), Argonne, IL (United States)

The SYCL standard specifies a cross-platform abstraction layer that enables programming of heterogeneous computing systems using standard C++ [1]. In the Open Computing Language (OpenCL) programming model, host and device code are written in different languages [2]. To improve development productivity and performance portability, the SYCL programming model combines the host and device code of an application in a type-safe way. In this report, we apply the SYCL programming model to a computationally intensive routine derived from the Hardware Accelerated Cosmology Code (HACC) framework in order to study performance portability on a heterogeneous computing device. We use the Intel® oneAPI toolkit [3] to build the OpenCL and SYCL programs, and we evaluate the performance of both implementations on Intel® integrated GPUs. We find that the SYCL implementation can achieve the same performance as the OpenCL implementation once we specify a target backend so that the compiler builds the SYCL program offline. Without a specified target backend, the runtime must compile the intermediate device code for the target platform, and this runtime overhead may become significant compared to the execution time of the kernel on the target device. Because the kernel routine is compute-bound, we evaluate the impact of the number of compute units on the kernel’s raw performance using two GPUs. When the hardware resources are not underutilized, we obtain an almost linear performance speedup when going from 48 to 72 compute units with offline compilation, which indicates that the number of compute units on a GPU is important for improving the raw performance of a compute-bound kernel. The experimental results show that SYCL is a promising programming model for heterogeneous computing.
The remainder of the report is organized as follows: In Section II, we describe the SYCL programming model, the steps to map an OpenCL program to a SYCL program, and the architecture of an integrated GPU. Section III introduces the kernel routine, and presents the OpenCL and SYCL implementations of the kernel. In Section IV, we evaluate the performance of the implementations on the GPUs. Section V concludes the report.
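
As a rough illustration of what an "almost linear" speedup from 48 to 72 compute units means, the ideal case can be worked out from the compute-unit counts alone. The symbols $T_{48}$ and $T_{72}$ (kernel execution times on the 48- and 72-compute-unit GPUs) and the efficiency $E$ are introduced for this sketch only and do not appear in the abstract; no measured values are implied.

$$
S_{\text{ideal}} = \frac{72}{48} = 1.5, \qquad
S_{\text{measured}} = \frac{T_{48}}{T_{72}}, \qquad
E = \frac{S_{\text{measured}}}{S_{\text{ideal}}}
$$

Under this reading, an almost linear speedup corresponds to $S_{\text{measured}}$ close to 1.5, that is, $E$ close to 1. The sketch assumes the two GPUs differ mainly in their number of compute units; differences in clock frequency or memory bandwidth would also affect the ratio.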

Research Organization:
Argonne National Laboratory (ANL), Argonne, IL (United States)
Sponsoring Organization:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR) (SC-21)
DOE Contract Number:
AC02-06CH11357
OSTI ID:
1576562
Report Number(s):
ANL/ALCF--19/5; 157540
Country of Publication:
United States
Language:
English

Similar Records

A Case Study on the HACCmk Routine in SYCL on Integrated Graphics
Conference · Tue Dec 31 23:00:00 EST 2019 · OSTI ID:1801627

Improving the performance of medical imaging applications using SYCL
Technical Report · Tue May 05 00:00:00 EDT 2020 · OSTI ID:1630290

The Rodinia Benchmark Suite in SYCL
Technical Report · Mon Jun 01 00:00:00 EDT 2020 · OSTI ID:1631460
