U.S. Department of Energy
Office of Scientific and Technical Information

Adapting In Situ Accelerators for Sparsity With Granular Matrix Reordering

Journal Article · IEEE Computer Architecture Letters
 [1];  [1];  [2];  [3]
  1. Univ. of Rochester, NY (United States)
  2. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
  3. QUALCOMM Inc., San Diego, CA (United States)

Neural network (NN) inference is an essential part of modern systems and lies at the heart of numerous applications, ranging from image recognition to natural language processing. In situ NN accelerators can perform NN inference efficiently using resistive crossbars, which makes them a promising solution to the data movement challenges faced by conventional architectures. Although such accelerators show significant potential for dense NNs, they often do not benefit from sparse NNs, which contain relatively few non-zero weights. Processing a sparse NN on an in situ accelerator wastes energy, because an entire crossbar must be charged even when most of its elements are zeros. To address this limitation, this paper proposes Granular Matrix Reordering (GMR), a preprocessing technique that enables energy-efficient computation of sparse NNs on in situ accelerators. GMR reorders the rows and columns of sparse weight matrices to maximize crossbar utilization and minimize the total number of crossbars that must be charged. The reordering process does not rely on sparsity patterns and incurs no accuracy loss. GMR achieves an average reduction of 28%, and up to 34%, in energy consumption over seven pruned NNs across four different pruning methods and network architectures.
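
The idea that reordering rows and columns can reduce the number of crossbars to charge without changing the result can be illustrated with a small sketch. The code below is not the GMR algorithm from the paper: it uses a naive sort-by-support heuristic chosen purely for illustration, treats each 2 x 2 tile of the weight matrix as one crossbar, and all names (`count_active_tiles`, `support_order`, `permute`, `matvec`, `TILE`) are hypothetical. It counts the tiles that hold at least one non-zero weight before and after reordering, then checks that undoing the row permutation on the output recovers the original matrix-vector product.

```python
def count_active_tiles(matrix, tile):
    """Count tile x tile blocks that hold at least one non-zero weight."""
    active = set()
    for r, row in enumerate(matrix):
        for c, w in enumerate(row):
            if w != 0:
                active.add((r // tile, c // tile))
    return len(active)


def support_order(matrix):
    """Naive heuristic (an assumption, not GMR): sort rows by the set of
    columns in which they are non-zero, so similar rows become adjacent."""
    def support(row):
        return tuple(i for i, w in enumerate(row) if w != 0)
    return sorted(range(len(matrix)), key=lambda r: support(matrix[r]))


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def permute(matrix, row_order, col_order):
    """Return the matrix with rows and columns taken in the given orders."""
    return [[matrix[r][c] for c in col_order] for r in row_order]


def matvec(matrix, x):
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]


# Toy sparse weight matrix: every 2 x 2 tile contains some non-zeros.
W = [
    [1, 0, 2, 0],
    [0, 3, 0, 4],
    [5, 0, 6, 0],
    [0, 7, 0, 8],
]
TILE = 2  # assumed crossbar size, for illustration only

row_order = support_order(W)             # order for the rows
col_order = support_order(transpose(W))  # same heuristic applied to columns
W_reordered = permute(W, row_order, col_order)

# The input vector must follow the column order; the output follows the row order.
x = [1, 2, 3, 4]
x_reordered = [x[c] for c in col_order]
y_reordered = matvec(W_reordered, x_reordered)

# Undo the row permutation to recover the original output ordering.
y_restored = [0] * len(W)
for i, r in enumerate(row_order):
    y_restored[r] = y_reordered[i]

print(count_active_tiles(W, TILE), count_active_tiles(W_reordered, TILE))  # 4 2
print(y_restored == matvec(W, x))  # True
```

In this toy case the number of tiles containing non-zeros drops from four to two, and because a permutation only relabels rows and columns, the computed output is unchanged once the ordering is restored. A real accelerator mapping would involve far larger matrices and a more careful reordering strategy than this sort.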

Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
AC04-94AL85000; NA0003525
OSTI ID:
1691454
Report Number(s):
SAND--2020-11301J; 691553
Journal Information:
Journal Name: IEEE Computer Architecture Letters; Journal Volume: 19; Journal Issue: 2; ISSN 1556-6056
Publisher:
IEEE
Country of Publication:
United States
Language:
English