U.S. Department of Energy
Office of Scientific and Technical Information

Evaluating CUDA Portability with HIPCL and DPCT

Conference

HIPCL expands the scope of the CUDA portability route from AMD platforms to OpenCL platforms. Meanwhile, the Intel DPC++ Compatibility Tool (DPCT) migrates CUDA programs to Data Parallel C++ (DPC++) programs. To enhance portability, we evaluate the performance of CUDA applications from Rodinia, SHOC, and proxy applications that were ported using HIPCL and DPCT and run on Intel GPUs. After profiling the ported programs, we aim to understand their performance gaps and to optimize the code converted by DPCT to improve its performance. The open-source repository of the CUDA, HIP, and DPCT programs will be useful for the development of a translator.

Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE; USDOE Office of Science (SC)
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1838992
Country of Publication:
United States
Language:
English

Similar Records

HIPLZ: Enabling performance portability for exascale systems
Journal Article · Mon Jul 17 00:00:00 EDT 2023 · Concurrency and Computation. Practice and Experience · OSTI ID:2279004

Case Study of Using Kokkos and SYCL as Performance-Portable Frameworks for Milc-Dslash Benchmark on NVIDIA, AMD and Intel GPUs
Conference · Thu Dec 31 23:00:00 EST 2020 · OSTI ID:1892057
