
O3BNN-R: An Out-Of-Order Architecture for High-Performance and Regularized BNN Inference

Journal Article · IEEE Transactions on Parallel and Distributed Systems
 [1];  [2];  [1];  [1];  [3];  [4];  [5];  [1]
  1. Boston University
  2. BATTELLE (PACIFIC NW LAB)
  3. Zhejiang University
  4. University of Hong Kong
  5. Los Alamos National Laboratory
Binarized Neural Networks (BNNs) have drawn tremendous attention due to their significantly reduced computational complexity and memory demand. They have shown particular promise in cost- and power-restricted domains, such as IoT and smart edge devices, where reaching a certain accuracy bar is often sufficient and real-time performance is highly desired. In this work, we demonstrate that the already highly condensed BNN model can be shrunk significantly further by dynamically pruning irregular redundant edges. Based on two new observations of BNN-specific properties, an out-of-order (OoO) architecture, O3BNN-R, can curtail edge evaluation whenever the binary output of a neuron can be determined early. Similar to Instruction-Level Parallelism (ILP), these fine-grained, irregular, runtime pruning opportunities are traditionally presumed to be difficult to exploit. To increase the pruning opportunities, we also optimize the training process by adding two regularization terms to the loss function, one for pooling pruning and one for threshold pruning. We evaluate our design on an FPGA platform using three well-known networks: VggNet-16 and AlexNet for ImageNet, and a VGG-like network for Cifar-10.
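The sketch below is a minimal illustration (not the paper's implementation) of the two early-exit opportunities the abstract describes: threshold pruning, where accumulation of XNOR-popcount terms stops once the running sum can no longer cross, or can no longer miss, the activation threshold; and pooling pruning, where a binary max-pooling window stops evaluating neurons once any of them has produced a 1. All function and variable names here are illustrative assumptions, not identifiers from the paper or its FPGA design.

```python
def binary_neuron(weights, inputs, threshold):
    """Evaluate a binary neuron, output = 1 iff popcount(XNOR(w, x)) >= threshold,
    with early exit (threshold pruning).

    weights, inputs: equal-length sequences of +1/-1 values; threshold: integer.
    """
    total = len(weights)
    acc = 0
    for i, (w, x) in enumerate(zip(weights, inputs)):
        acc += 1 if w == x else 0           # XNOR + popcount, one edge at a time
        remaining = total - (i + 1)
        if acc >= threshold:                # already above threshold: output is 1
            return 1
        if acc + remaining < threshold:     # can never reach threshold: output is 0
            return 0
    return 1 if acc >= threshold else 0


def max_pool_window(neurons):
    """Binary max-pooling with pooling pruning.

    neurons: callables each returning 0/1; evaluation stops at the first 1,
    since the max over binary values is then already determined.
    """
    for evaluate in neurons:
        if evaluate() == 1:
            return 1
    return 0
```

The regularization terms mentioned in the abstract are intended to make these early exits trigger more often, so the hardware skips a larger fraction of edge evaluations at inference time.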
Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1670985
Report Number(s):
PNNL-SA-148318
Journal Information:
IEEE Transactions on Parallel and Distributed Systems, Vol. 32, Issue 1
Country of Publication:
United States
Language:
English

Similar Records

O3BNN: An Out-Of-Order Architecture for High-Performance Binarized Neural Network Inference with Fine-Grained Pruning
Conference · August 2019 · OSTI ID: 1764982

LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism
Conference · September 2019 · OSTI ID: 1765112

BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets
Conference · November 2019 · OSTI ID: 1580517