Title: An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory

Journal Article · IEEE Transactions on Circuits and Systems I: Regular Papers

In this work, we demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40 nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, which matches the typical weight distribution of neural networks, heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate the end-to-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy using a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high On/Off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a >10× gain in energy efficiency over state-of-the-art digital and analog inference accelerators.
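
The abstract's argument that low error at low conductance suits near-zero-skewed weights can be illustrated with a toy simulation. The sketch below is not the paper's measured error model or its simulator. It assumes two hypothetical Gaussian programming-error models (one whose standard deviation scales with the weight magnitude, one with a fixed standard deviation set by the largest weight), a Laplace-like weight distribution, and invented helper names (`mvm`, `perturb`, `rms_error`), then compares the resulting error in a matrix-vector product.

```python
import random

def mvm(weights, x):
    """Ideal matrix-vector product: one dot product per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def perturb(weights, sigma, proportional, rng):
    """Return a copy of `weights` with Gaussian programming error added.

    proportional=True : error std = sigma * |w|      (assumed: small weights get small error)
    proportional=False: error std = sigma * max|w|   (assumed: same error for every weight)
    Both models are illustrative assumptions, not measured device data.
    """
    w_max = max(abs(w) for row in weights for w in row)
    noisy = []
    for row in weights:
        noisy_row = []
        for w in row:
            std = sigma * abs(w) if proportional else sigma * w_max
            noisy_row.append(w + rng.gauss(0.0, std))
        noisy.append(noisy_row)
    return noisy

def rms_error(a, b):
    """Root-mean-square difference between two equal-length vectors."""
    return (sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Weights skewed toward zero: exponential magnitudes with a random sign.
weights = [[rng.choice((-1, 1)) * rng.expovariate(10.0) for _ in range(256)]
           for _ in range(64)]
x = [rng.random() for _ in range(256)]  # non-negative input activations

ideal = mvm(weights, x)
err_prop = rms_error(mvm(perturb(weights, 0.05, True, rng), x), ideal)
err_flat = rms_error(mvm(perturb(weights, 0.05, False, rng), x), ideal)

print(f"RMS output error, magnitude-proportional weight error: {err_prop:.4f}")
print(f"RMS output error, state-independent weight error:      {err_flat:.4f}")
```

Under these assumptions the magnitude-proportional model gives a noticeably smaller output error, because most weights sit near zero and therefore carry little absolute error. The gap depends entirely on the assumed distributions and should not be read as a reproduction of the reported accuracy figures.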

Research Organization:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA); USDOE Laboratory Directed Research and Development (LDRD) Program; Defense Threat Reduction Agency (DTRA)
Grant/Contract Number:
NA0003525; HDTRA1-17-1-0038
OSTI ID:
1842837
Report Number(s):
SAND-2022-0047J; 703075
Journal Information:
IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 69, Issue 4; ISSN 1549-8328
Publisher:
IEEE
Country of Publication:
United States
Language:
English
