U.S. Department of Energy
Office of Scientific and Technical Information

O3BNN: An Out-Of-Order Architecture for High-Performance Binarized Neural Network Inference with Fine-Grained Pruning

Conference
In this work, we demonstrate that the highly condensed BNN model can be shrunk significantly further by dynamically pruning irregular, redundant edges. Based on two new observations of BNN-specific properties, our out-of-order (OoO) architecture, O3BNN, can curtail the evaluation of remaining edges whenever the binary output of a neuron can be determined early. As with Instruction-Level Parallelism (ILP), these fine-grained, irregular, runtime pruning opportunities are traditionally presumed difficult to exploit. We evaluate our design on an FPGA platform using three well-known networks: VggNet-16 and AlexNet for ImageNet, and a VGG-like network for Cifar-10. Results show that our out-of-order approach can prune 27%, 16%, and 42% of the operations for the three networks, respectively, without any accuracy loss, yielding speedups of at least 1.7×, 1.5×, and 2.1× over state-of-the-art FPGA, GPU, and CPU BNN implementations. Because our approach prunes at inference time, no retraining or fine-tuning is needed. Although we demonstrate the design on an FPGA, the approach does not rely on any FPGA-specific features and can be adopted on other devices as well.
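The early-termination idea in the abstract can be illustrated in software. A binarized neuron computes a popcount of XNORed weight/input bits and compares it against a threshold; once the partial popcount either already reaches the threshold or can no longer reach it even if every remaining edge agrees, the output bit is decided and the remaining edges can be skipped. The sketch below is illustrative only (function and variable names are hypothetical, and the paper's hardware mechanism differs):

```python
def binarized_neuron_early_exit(w_bits, x_bits, threshold):
    """Evaluate a binarized neuron (popcount of XNOR) with early exit.

    Returns (output_bit, edges_evaluated). This is a software sketch of
    runtime edge pruning: stop as soon as the output is determined.
    """
    n = len(w_bits)
    count = 0
    for i, (w, x) in enumerate(zip(w_bits, x_bits)):
        count += 1 if w == x else 0      # XNOR: +1 when the bits agree
        remaining = n - (i + 1)
        if count >= threshold:           # output is already forced to 1
            return 1, i + 1
        if count + remaining < threshold:  # output can never reach 1
            return 0, i + 1
    return (1 if count >= threshold else 0), n
```

For example, with all four bit pairs agreeing and a threshold of 2, the output is decided after only two of the four edges, pruning the rest.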
Research Organization:
Pacific Northwest National Laboratory (PNNL), Richland, WA (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-76RL01830
OSTI ID:
1764982
Report Number(s):
PNNL-SA-141065
Country of Publication:
United States
Language:
English

Similar Records

O3BNN-R: An Out-Of-Order Architecture for High-Performance and Regularized BNN Inference
Journal Article · 2021 · IEEE Transactions on Parallel and Distributed Systems · OSTI ID: 1670985

BSTC: A Novel Binarized-Soft-Tensor-Core Design for Accelerating Bit-Based Approximated Neural Nets
Conference · November 2019 · OSTI ID: 1580517

LP-BNN: Ultra-low-Latency BNN Inference with Layer Parallelism
Conference · September 2019 · OSTI ID: 1765112

Related Subjects