DOE PAGES: U.S. Department of Energy
Office of Scientific and Technical Information

Title: Contrasting advantages of learning with random weights and backpropagation in non-volatile memory neural networks

Abstract

Recently, a Cambrian explosion of novel non-volatile memory (NVM) devices known as memristive devices has inspired efforts to build hardware neural networks that learn like the brain. Early experimental prototypes built simple perceptrons from nanosynapses, and recently fully-connected multi-layer perceptron (MLP) learning systems have been realized. However, while backpropagating learning systems pair well with high-precision computer memories and achieve state-of-the-art performance, this typically comes with a massive energy budget. For future Internet of Things/peripheral use cases, system energy footprint will be a major constraint, and emerging NVM devices may fill the gap by sacrificing high bit precision for lower energy. In this work, we contrast the well-known MLP approach with the Extreme Learning Machine (ELM) or NoProp approach, which uses a large layer of random weights to improve the separability of high-dimensional tasks and is usually considered inferior in a software context. However, we find that when device non-linearity is taken into account, NoProp equals the hardware MLP system in terms of accuracy. Using a sign-based adaptation of the delta rule for energy savings, we also find that NoProp can learn effectively with four to six 'bits' of device analog capacity, while MLP requires eight-bit capacity with the same rule. This may allow the requirements for memristive devices to be relaxed in the context of online learning. By comparing the energy footprint of these systems for several candidate nanosynapses, as well as realistic peripherals, we confirm that memristive NoProp systems save energy compared to MLP systems. Lastly, we show that ELM/NoProp systems can achieve better generalization abilities than nanosynaptic MLP systems when paired with pre-processing layers (which do not require backpropagated error). Collectively, these advantages make such systems worthy of consideration in future accelerators or embedded hardware.
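As a rough illustration of the scheme the abstract describes, the following sketch pairs a fixed random projection layer with a quantized readout trained by a sign-based delta rule. This is not the authors' code: the layer sizes, the tanh hidden nonlinearity, the mapping from discrete conductance levels to signed weights, and the 4-bit capacity are illustrative assumptions.

# Minimal sketch (assumptions noted below) of ELM/NoProp learning with a
# sign-based delta rule. Assumed, not from the paper: layer sizes, the tanh
# nonlinearity, and the level-to-weight mapping.
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_HIDDEN, N_OUT = 64, 512, 10  # illustrative layer sizes
BITS = 4                             # device analog capacity under test
LEVELS = 2 ** BITS                   # discrete conductance levels per device

W_rand = rng.standard_normal((N_IN, N_HIDDEN))      # fixed random layer, never trained
W_out = rng.integers(0, LEVELS, (N_HIDDEN, N_OUT))  # quantized readout weights

def forward(x):
    # Random projection with a nonlinearity, then a linear readout.
    h = np.tanh(x @ W_rand)
    y = h @ (W_out / (LEVELS - 1) - 0.5)  # map levels to a symmetric weight range
    return h, y

def train_step(x, target):
    # Sign-based delta rule: each synapse moves at most one conductance level
    # per update, the energy-saving simplification discussed in the abstract.
    global W_out
    h, y = forward(x)
    err = target - y
    step = np.sign(np.outer(h, err)).astype(int)
    W_out = np.clip(W_out + step, 0, LEVELS - 1)

# Example: one update on a random input toward a one-hot target.
x = rng.standard_normal(N_IN)
train_step(x, np.eye(N_OUT)[3])

Because only the readout layer is trained, no error needs to be propagated back through the random layer; this is what lets the hidden weights stay fixed and low-precision.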

Authors:
 Bennett, Christopher H. [1]; Parmar, Vivek [2]; Calvet, Laurie E. [3]; Klein, Jacques-Olivier [3]; Suri, Manan [2]; Marinella, Matthew J. [4]; Querlioz, Damien [3]
  1. Univ. Paris-Sud, Universite Paris-Saclay (France); Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
  2. Indian Institute of Technology Delhi, New Delhi (India)
  3. Univ. Paris-Sud, Universite Paris-Saclay (France)
  4. Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Publication Date:
May 30, 2019
Research Org.:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.:
USDOE National Nuclear Security Administration (NNSA)
OSTI Identifier:
1526218
Report Number(s):
SAND-2019-5935J
Journal ID: ISSN 2169-3536; 675857
Grant/Contract Number:  
AC04-94AL85000; NA0003525
Resource Type:
Accepted Manuscript
Journal Name:
IEEE Access
Additional Journal Information:
Journal Volume: 7; Journal ID: ISSN 2169-3536
Publisher:
IEEE
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; hardware neural networks; memristive devices; online learning; edge computing

Citation Formats

Bennett, Christopher H., Parmar, Vivek, Calvet, Laurie E., Klein, Jacques-Olivier, Suri, Manan, Marinella, Matthew J., and Querlioz, Damien. Contrasting advantages of learning with random weights and backpropagation in non-volatile memory neural networks. United States: N. p., 2019. Web. doi:10.1109/ACCESS.2019.2920076.
Bennett, Christopher H., Parmar, Vivek, Calvet, Laurie E., Klein, Jacques-Olivier, Suri, Manan, Marinella, Matthew J., & Querlioz, Damien. Contrasting advantages of learning with random weights and backpropagation in non-volatile memory neural networks. United States. https://doi.org/10.1109/ACCESS.2019.2920076
Bennett, Christopher H., Parmar, Vivek, Calvet, Laurie E., Klein, Jacques-Olivier, Suri, Manan, Marinella, Matthew J., and Querlioz, Damien. Thu May 30, 2019. "Contrasting advantages of learning with random weights and backpropagation in non-volatile memory neural networks". United States. https://doi.org/10.1109/ACCESS.2019.2920076. https://www.osti.gov/servlets/purl/1526218.
@article{osti_1526218,
title = {Contrasting advantages of learning with random weights and backpropagation in non-volatile memory neural networks},
author = {Bennett, Christopher H. and Parmar, Vivek and Calvet, Laurie E. and Klein, Jacques-Olivier and Suri, Manan and Marinella, Matthew J. and Querlioz, Damien},
abstractNote = {Recently, a Cambrian explosion of novel, non-volatile memory (NVM) devices known as memristive devices have inspired effort in building hardware neural networks that learn like the brain. Early experimental prototypes built simple perceptrons from nanosynapses, and recently, fully-connected multi-layer perceptron (MLP) learning systems have been realized. However, while backpropagating learning systems pair well with high-precision computer memories and achieve state-of-the-art performances, this typically comes with a massive energy budget. For future Internet of Things/peripheral use cases, system energy footprint will be a major constraint, and emerging NVM devices may fill the gap by sacrificing high bit precision for lower energy. In this work, we contrast the well known MLP approach with the Extreme Learning Machine (ELM) or NoProp approach, which uses a large layer of random weights to improve the separability of high-dimensional tasks, and is usually considered inferior in a software context. However, we find that when taking device non-linearity into account, NoProp manages to equal hardware MLP system in terms of accuracy. While also using a sign-based adaptation of the delta rule for energy-savings, we find that NoProp can learn effectively with four to six ’bits’ of device analog capacity, while MLP requires eight bit capacity with the same rule. This may allow the requirements for memristive devices to be relaxed in the context of online learning. By comparing the energy footprint of these systems for several candidate nanosynapses, as well as realistic peripherals, we confirm that memristive NoProp systems save energy compared to MLP systems. Lastly, we show that ELM/NoProp systems can achieve better generalization abilities than nanosynaptic MLP systems when paired with pre-processing layers (which do not require backpropagated error). Collectively, these advantages make such systems worthy of consideration in future accelerators or embedded hardware.},
doi = {10.1109/ACCESS.2019.2920076},
journal = {IEEE Access},
volume = 7,
place = {United States},
year = {2019},
month = {5}
}

Journal Article:
Free Publicly Available Full Text
Publisher's Version of Record

Citation Metrics:
Cited by: 7 works
Citation information provided by
Web of Science

Figures / Tables:

FIGURE 1: (a) and (b) show jump tables for device evolution starting at $G_{on}$/$G_{max}$ and $G_{off}$/$G_{min}$ conductance, respectively, using the linear model (Eqn. 1); (c) and (d) depict the same for the non-linear case (Eqn. 2), where $\Delta G$ is modulated by the device's state relative to its extrema.
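A hedged sketch of the two device-update models the caption contrasts. Eqns. 1 and 2 are not reproduced in this record, so the non-linear form below is one common choice from the memristive-synapse literature (step size proportional to the distance from the nearest conductance extremum) and may differ in detail from the paper's; the bounds and step-size parameter are illustrative.

# Assumed, not from the paper: G_MIN, G_MAX, ALPHA, and the exact non-linear form.
G_MIN, G_MAX = 0.0, 1.0  # illustrative conductance bounds
ALPHA = 0.05             # illustrative step-size parameter

def update_linear(g, potentiate):
    # Linear model (cf. Eqn. 1): a fixed step, independent of device state.
    step = ALPHA if potentiate else -ALPHA
    return min(max(g + step, G_MIN), G_MAX)

def update_nonlinear(g, potentiate):
    # Non-linear model (cf. Eqn. 2): the step shrinks as the conductance
    # approaches its extremum, as in panels (c) and (d).
    step = ALPHA * (G_MAX - g) if potentiate else -ALPHA * (g - G_MIN)
    return min(max(g + step, G_MIN), G_MAX)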


Works referencing / citing this record:

Voltage control of domain walls in magnetic nanowires for energy-efficient neuromorphic devices
journal, January 2020

  • Azam, Md Ali; Bhattacharya, Dhritiman; Querlioz, Damien
  • Nanotechnology, Vol. 31, Issue 14
  • DOI: 10.1088/1361-6528/ab6234