U.S. Department of Energy
Office of Scientific and Technical Information

Accelerating Transformer-based Deep Learning Models on FPGAs using Column Balanced Block Pruning

Conference ·
OSTI ID:1811281
 [1];  [1];  [2];  [2];  [3];  [4];  [4];  [1]
  1. University of Connecticut
  2. BATTELLE (PACIFIC NW LAB)
  3. University of Notre Dame
  4. Stevens Institute of Technology
Although Transformer-based language representations achieve state-of-the-art accuracy on various natural language processing (NLP) tasks, their large model size poses a challenge for resource-constrained computing platforms. Weight pruning, a popular and effective technique for reducing the number of weight parameters and accelerating the Transformer, has been investigated on GPUs. However, Transformer acceleration using weight pruning on field-programmable gate arrays (FPGAs) remains unexplored. This paper investigates column balanced block-wise pruning for the Transformer and designs an FPGA acceleration engine customized for balanced block-wise matrix multiplication. We implement the Transformer model with proper hardware scheduling, and the experiments show that Transformer inference on the FPGA achieves 10.35 ms latency with a batch size of 32, a 10.96× speedup over the CPU platform and a 2.08× speedup over the GPU platform.
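The core idea behind column balanced block-wise pruning is that every column of a weight matrix retains the same number of nonzero blocks, so each hardware processing element receives an equal share of work. A minimal NumPy sketch of this scheme (an illustration of the general technique, not the paper's exact algorithm; the function name, block size, and keep count are assumptions for the example):

```python
import numpy as np

def column_balanced_block_prune(W, block_size=4, keep_blocks=2):
    """Illustrative sketch: split each column of W into row-blocks of
    `block_size`, keep the `keep_blocks` blocks with the largest L2 norm
    per column, and zero out the rest. Every column keeps exactly the
    same number of blocks, which balances per-column workload."""
    rows, cols = W.shape
    assert rows % block_size == 0, "rows must divide evenly into blocks"
    n_blocks = rows // block_size
    Wp = np.zeros_like(W)
    # View rows as (n_blocks, block_size) groups, per column.
    blocks = W.reshape(n_blocks, block_size, cols)
    norms = np.linalg.norm(blocks, axis=1)  # shape: (n_blocks, cols)
    for c in range(cols):
        # Indices of the top-k blocks (by L2 norm) in this column.
        keep = np.argsort(norms[:, c])[-keep_blocks:]
        for b in keep:
            Wp[b * block_size:(b + 1) * block_size, c] = \
                W[b * block_size:(b + 1) * block_size, c]
    return Wp
```

Because the pruned matrix has a fixed block count per column, the FPGA engine can schedule the sparse matrix multiplication with a regular, statically balanced access pattern instead of handling irregular sparsity.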
Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1811281
Report Number(s):
PNNL-SA-159983
Country of Publication:
United States
Language:
English

Similar Records

Evaluating LULESH Kernels on OpenCL FPGA
Conference · 2018 · OSTI ID:1528953

FPGAs as a Service to Accelerate Machine Learning Inference [PowerPoint]
Technical Report · 2019 · OSTI ID:1570210

FPGA-Accelerated Machine Learning Inference as a Service for Particle Physics Computing
Journal Article · 2019 · Computing and Software for Big Science · OSTI ID:1565955
