U.S. Department of Energy
Office of Scientific and Technical Information

An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory

Journal Article · IEEE Transactions on Circuits and Systems I: Regular Papers
In this work, we demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40 nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, which matches the typical weight distribution of neural networks, heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate the end-to-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy using a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high On/Off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a >10× gain in energy efficiency over state-of-the-art digital and analog inference accelerators.
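The abstract's central argument, that low error at low conductance suits weights clustered near zero, can be illustrated with a toy calculation. The sketch below is not the authors' simulation and uses no measured SONOS data. It assumes Laplace-distributed weights, Gaussian programming error, and two hypothetical error models with the same worst-case spread `sigma_max`: one where the error is the same for every weight, and one where it scales with weight magnitude. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def make_weights(rows, cols, rng, scale=0.05):
    """Draw weights concentrated near zero (assumed Laplace shape), clipped to [-1, 1]."""
    def laplace():
        # The difference of two i.i.d. exponentials is Laplace-distributed.
        return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return [[max(-1.0, min(1.0, laplace())) for _ in range(cols)]
            for _ in range(rows)]


def matvec(w, x):
    """Ideal matrix-vector product, standing in for the in-memory multiply."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def program(w, rng, sigma_max, proportional):
    """Return a copy of the weights with Gaussian programming error added.

    proportional=True : error spread scales with |w| (assumed model in which
                        small weights are stored with small error).
    proportional=False: error spread is sigma_max for every weight.
    """
    noisy = []
    for row in w:
        noisy.append([
            wij + rng.gauss(0.0, sigma_max * abs(wij) if proportional else sigma_max)
            for wij in row
        ])
    return noisy


def rms_output_error(w, w_noisy, x):
    """Root-mean-square difference between noisy and ideal outputs."""
    ideal = matvec(w, x)
    actual = matvec(w_noisy, x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(actual, ideal)) / len(ideal))


def compare(rows=64, cols=64, sigma_max=0.05, seed=0):
    """Return (uniform-error RMS, proportional-error RMS) for one random layer."""
    rng = random.Random(seed)
    w = make_weights(rows, cols, rng)
    x = [rng.random() for _ in range(cols)]
    err_uniform = rms_output_error(w, program(w, rng, sigma_max, False), x)
    err_prop = rms_output_error(w, program(w, rng, sigma_max, True), x)
    return err_uniform, err_prop


if __name__ == "__main__":
    uniform, proportional = compare()
    print(f"uniform error model:      {uniform:.4f}")
    print(f"proportional error model: {proportional:.4f}")
```

Under these assumptions the proportional model gives a much smaller output error, because most weights are small and therefore contribute little noise. This only shows the direction of the effect. It says nothing about the magnitudes reported in the paper.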
Research Organization:
Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
Defense Threat Reduction Agency (DTRA); USDOE Laboratory Directed Research and Development (LDRD) Program; USDOE National Nuclear Security Administration (NNSA)
Grant/Contract Number:
NA0003525
OSTI ID:
1842837
Report Number(s):
SAND--2022-0047J; 703075
Journal Information:
IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 69, Issue 4; ISSN 1549-8328
Publisher:
IEEE
Country of Publication:
United States
Language:
English
