U.S. Department of Energy
Office of Scientific and Technical Information

Wootz: a compiler-based framework for fast CNN pruning via composability

Conference ·
Convolutional Neural Networks (CNNs) are widely used for deep learning tasks. CNN pruning is an important method for adapting a large CNN model trained on general datasets to a more specialized task or a smaller device. The key challenge is deciding which filters to remove to maximize the quality of the pruned networks while satisfying the constraints; the process is time-consuming due to the enormous configuration space and the slowness of CNN training. The problem has drawn many efforts from the machine learning field, which try to reduce the set of network configurations to explore. This work tackles the problem distinctively from a programming-systems perspective, speeding up the evaluation of the remaining configurations through computation reuse via a compiler-based framework. We empirically uncover the existence of composability in the training of a collection of pruned CNN models and point out the opportunities for computation reuse. We then propose composability-based CNN pruning and design a compression-based algorithm to efficiently identify the set of CNN layers to pre-train so as to maximize their reuse benefits in CNN pruning. We further develop a compiler-based framework named Wootz which, for an arbitrary CNN, automatically generates code that builds a Teacher-Student scheme to materialize composability-based pruning. Experiments show that network pruning enabled by Wootz shortens the state-of-the-art pruning process by up to 186X while producing significantly better pruning results.
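The core observation behind composability-based pruning is that many pruning configurations share identical blocks of layers (the same layers pruned at the same rates), so such a block only needs to be (pre-)trained once and can then be reused across every configuration that contains it. The sketch below is a minimal, hypothetical illustration of that reuse analysis, not the actual Wootz algorithm: it represents each configuration as a tuple of per-layer keep-ratios, enumerates contiguous layer blocks, and flags blocks recurring often enough to be worth pre-training. The function name `shared_blocks` and the `min_reuse`/`max_len` parameters are assumptions for the example.

```python
from collections import Counter

def shared_blocks(configs, min_reuse=2, max_len=3):
    """Count contiguous layer blocks that recur across pruned-network
    configurations. A block is a tuple of (layer_index, keep_ratio)
    pairs; blocks seen at least `min_reuse` times are candidates for
    one-time pre-training, whose result all containing configurations
    can then reuse instead of retraining from scratch."""
    counts = Counter()
    for cfg in configs:
        for start in range(len(cfg)):
            for length in range(1, max_len + 1):
                if start + length <= len(cfg):
                    block = tuple(enumerate(cfg[start:start + length], start))
                    counts[block] += 1
    return {block: n for block, n in counts.items() if n >= min_reuse}

# Four pruning configurations of a 4-layer CNN (per-layer keep-ratios).
configs = [
    (1.0, 0.5, 0.5, 1.0),
    (1.0, 0.5, 0.5, 0.7),
    (0.7, 0.5, 0.5, 1.0),
    (0.7, 0.7, 0.5, 1.0),
]
reusable = shared_blocks(configs)
# e.g. the block "layers 1-2 both pruned to 0.5" appears in three of
# the four configurations, so pre-training it once saves two retrainings.
```

The real framework uses a compression-based algorithm over the configuration set to pick which blocks to pre-train, and a Teacher-Student scheme to train them; this toy frequency count only conveys why reuse opportunities exist at all.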
Research Organization:
Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States)
Sponsoring Organization:
USDOE
DOE Contract Number:
AC05-00OR22725
OSTI ID:
1543204
Country of Publication:
United States
Language:
English


Similar Records

Composability-Centered Convolutional Neural Network Pruning
Technical Report · February 2018 · OSTI ID: 1427608

A Novel Pruning Method for Convolutional Neural Networks Based off Identifying Critical Filters
Conference · July 2019 · OSTI ID: 1557493

Ps and Qs: Quantization-Aware Pruning for Efficient Low Latency Neural Network Inference
Journal Article · July 2021 · Frontiers in Artificial Intelligence · OSTI ID: 1824191
