OSTI.GOV: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL

Abstract

The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing systems. OpenCL extends the C programming language for developing portable code on different platforms such as CPUs, graphics processing units (GPUs), digital signal processors (DSPs), and field-programmable gate arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow in favor of a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. This approach makes FPGA-based development more accessible to software users as the need for hybrid computing using CPUs and FPGAs increases. It can also significantly reduce hardware development time, because users can evaluate different ideas in a high-level language without deep FPGA domain knowledge. In this report, we evaluate the performance of the AES kernel using the Intel FPGA SDK for OpenCL and the Nallatech 385A FPGA board. Compared to the M506 module, the board provides more hardware resources for a larger design exploration space. The kernel performance is measured with the compute kernel throughput, an upper bound on the FPGA throughput. The report presents the experimental results in detail. The Appendix lists the kernel source code.

Authors:
 Jin, Zheming [1]; Yoshii, Kazutomo [1]; Finkel, Hal [1]; Cappello, Franck [1]
  1. Argonne National Lab. (ANL), Argonne, IL (United States)
Publication Date:
Research Org.:
Argonne National Lab. (ANL), Argonne, IL (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
1357909
Report Number(s):
ANL/ALCF-17/3
135284
DOE Contract Number:
AC02-06CH11357
Resource Type:
Technical Report
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; OpenCL; FPGA; AES

Citation Formats

Jin, Zheming, Yoshii, Kazutomo, Finkel, Hal, and Cappello, Franck. Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL. United States: N. p., 2017. Web. doi:10.2172/1357909.
Jin, Zheming, Yoshii, Kazutomo, Finkel, Hal, & Cappello, Franck. Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL. United States. doi:10.2172/1357909.
Jin, Zheming, Yoshii, Kazutomo, Finkel, Hal, and Cappello, Franck. 2017. "Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL". United States. doi:10.2172/1357909. https://www.osti.gov/servlets/purl/1357909.
@techreport{osti_1357909,
title = {Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL},
author = {Jin, Zheming and Yoshii, Kazutomo and Finkel, Hal and Cappello, Franck},
abstractNote = {The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing systems. OpenCL extends the C programming language for developing portable code on different platforms such as CPUs, graphics processing units (GPUs), digital signal processors (DSPs), and field-programmable gate arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow in favor of a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. This approach makes FPGA-based development more accessible to software users as the need for hybrid computing using CPUs and FPGAs increases. It can also significantly reduce hardware development time, because users can evaluate different ideas in a high-level language without deep FPGA domain knowledge. In this report, we evaluate the performance of the AES kernel using the Intel FPGA SDK for OpenCL and the Nallatech 385A FPGA board. Compared to the M506 module, the board provides more hardware resources for a larger design exploration space. The kernel performance is measured with the compute kernel throughput, an upper bound on the FPGA throughput. The report presents the experimental results in detail. The Appendix lists the kernel source code.},
doi = {10.2172/1357909},
institution = {Argonne National Lab. (ANL), Argonne, IL (United States)},
number = {ANL/ALCF-17/3},
place = {United States},
year = 2017,
month = 4
}

  • Open Computing Language (OpenCL) is a high-level language that enables software programmers to explore Field Programmable Gate Arrays (FPGAs) for application acceleration. The Intel FPGA software development kit (SDK) for OpenCL allows a user to specify applications at a high level and explore the performance of low-level hardware acceleration. In this report, we present the FPGA performance and power consumption results of the single-precision floating-point vector add OpenCL kernel using the Intel FPGA SDK for OpenCL on the Nallatech 385A FPGA board. The board features an Arria 10 FPGA. We evaluate the FPGA implementations using the compute unit duplication and kernel vectorization optimization techniques. On the Nallatech 385A FPGA board, the maximum compute kernel bandwidth we achieve is 25.8 GB/s, approximately 76% of the peak memory bandwidth. The power consumption of the FPGA device when running the kernels ranges from 29 W to 42 W.
  • The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing systems. OpenCL extends the C programming language for developing portable code on different platforms such as CPUs, graphics processing units (GPUs), digital signal processors (DSPs), and field-programmable gate arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow in favor of a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. This approach makes FPGA-based development more accessible to software users as the need for hybrid computing using CPUs and FPGAs increases. It can also significantly reduce hardware development time, because users can evaluate different ideas in a high-level language without deep FPGA domain knowledge. Benchmarking an OpenCL-based framework is an effective way to analyze the performance of a system by studying the execution of benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using the Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that the compiler could successfully synthesize. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and the Nallatech 385A FPGA board, which features an Arria 10 FPGA device. The focus of the report is to better understand the resource usage and performance of the kernel implementations on Arria 10 FPGA devices compared to Stratix V FPGA devices. In addition, we gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark application.
  • With the recent availability of commercial parallel computers, researchers are examining new classes of problems that may benefit from parallel processing. This report presents results of an investigation of the set of problems classified as search-intensive. The specific problems discussed in this report are the backtracking search method for the N-queens problem and the Least-Cost Branch and Bound search for deadline job scheduling. The object-oriented design methodology was used to map each problem into a parallel solution. While the initial design was good for a prototype, the best performance resulted from fine-tuning the algorithms for a specific computer. The experiments on N-queens and deadline job scheduling included an analysis of the computation time to first solution, the computation time to all solutions, the speedup over a VAX 11/785, and the load balance of the problem when using an Intel Personal SuperComputer (iPSC). The iPSC is a loosely coupled multiprocessor system based on a hypercube architecture. Results are presented that compare the performance of the iPSC and the VAX 11/785 for these classes of problems.
  • The Intel 2102-1 500-ns RAM is a single monolithic integrated circuit fabricated with N-channel Si-gate technology. The results of transient radiation tests using electron beams from the NRL Linac facility indicate that the device cannot be used in an application where the total dose is greater than 2.0 × 10^4 rad (Si). Further, allowance for a device recovery time of 25 ns should be made if the device is to be exposed to a pulsed radiation environment (60 ns) with an applied dose of between several hundred rad and approximately 2 × 10^4 rad. (GRA)
  • Time-dependent Ginzburg-Landau (TDGL) equations are considered for modeling a thin-film, finite-size superconductor placed under a magnetic field. The problem leads to the use of so-called natural boundary conditions. The computational domain is partitioned into subdomains, and bond variables are used to obtain the corresponding discrete system of equations. An efficient time-differencing method based on the Forward Euler method is developed. Finally, a variable-strength magnetic field resulting in vortex motion in Type II high-Tc superconducting films is introduced. The authors tackled the problem using two different state-of-the-art parallel computing tools: BlockComm/Chameleon and PCN. They had access to two high-performance distributed-memory supercomputers: the Intel iPSC/860 and the IBM SP1. They also tested the codes using a cluster of Sun Sparc workstations as a parallel computing environment.