In this work, we demonstrate SONOS (silicon-oxide-nitride-oxide-silicon) analog memory arrays that are optimized for neural network inference. The devices are fabricated in a 40 nm process and operated in the subthreshold regime for in-memory matrix multiplication. Subthreshold operation enables low conductances to be implemented with low error, a good match for the typical weight distribution of neural networks, which is heavily skewed toward near-zero values. This leads to high accuracy in the presence of programming errors and process variations. We simulate end-to-end neural network inference accuracy, accounting for the measured programming error, read noise, and retention loss in a fabricated SONOS array. Evaluated on the ImageNet dataset using ResNet50, the accuracy of a SONOS system is within 2.16% of floating-point accuracy without any retraining. The unique error properties and high On/Off ratio of the SONOS device allow scaling to large arrays without bit slicing, and enable an inference architecture that achieves 20 TOPS/W on ResNet50, a >10× gain in energy efficiency over state-of-the-art digital and analog inference accelerators.
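The abstract's central claim — that conductance-proportional programming error favors weight distributions skewed toward zero — can be illustrated with a minimal simulation. The sketch below is not the paper's measured error model; the Laplace weight distribution, the differential conductance mapping, and the 3% proportional error level are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical skewed weight distribution: most weights near zero,
# as is typical for trained neural-network layers.
weights = rng.laplace(loc=0.0, scale=0.05, size=(256, 256))

# Map signed weights onto a differential pair of conductances,
# w ∝ g_pos - g_neg, each scaled into [0, g_max]. g_max is arbitrary here.
g_max = 1.0
scale = np.abs(weights).max()
g_pos = g_max * np.clip(weights, 0, None) / scale
g_neg = g_max * np.clip(-weights, 0, None) / scale

# Programming error proportional to the target conductance, so
# near-zero conductances are written with near-zero absolute error.
sigma = 0.03  # assumed 3% proportional write error
g_pos_written = g_pos * (1 + sigma * rng.standard_normal(g_pos.shape))
g_neg_written = g_neg * (1 + sigma * rng.standard_normal(g_neg.shape))

x = rng.random(256)                       # input activation vector
y_ideal = (g_pos - g_neg) @ x             # error-free analog MVM
y_noisy = (g_pos_written - g_neg_written) @ x

rel_err = np.linalg.norm(y_noisy - y_ideal) / np.linalg.norm(y_ideal)
print(f"relative MVM error: {rel_err:.4f}")
```

Because the error scales with the programmed conductance, the many near-zero weights contribute almost no absolute error to the matrix-vector product, which is the property the paper exploits for accurate inference.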
Xiao, Tianyao Patrick, et al. "An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory." IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 69, no. 4, Jan. 2022. https://doi.org/10.1109/tcsi.2021.3134313
Xiao, Tianyao Patrick, Feinberg, Benjamin, Bennett, Christopher H., Agrawal, Vineet, Saxena, Prashant, Prabhakar, Venkatraman, Ramkumar, Krishnaswamy, Medu, Harsha, Raghavan, Vijay, Chettuvetty, Ramesh, Agarwal, Sapan, & Marinella, Matthew J. (2022). An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory. IEEE Transactions on Circuits and Systems I: Regular Papers, 69(4). https://doi.org/10.1109/tcsi.2021.3134313
Xiao, Tianyao Patrick, Feinberg, Benjamin, Bennett, Christopher H., et al., "An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory," IEEE Transactions on Circuits and Systems I: Regular Papers 69, no. 4 (2022), https://doi.org/10.1109/tcsi.2021.3134313
@article{osti_1842837,
author = {Xiao, Tianyao Patrick and Feinberg, Benjamin and Bennett, Christopher H. and Agrawal, Vineet and Saxena, Prashant and Prabhakar, Venkatraman and Ramkumar, Krishnaswamy and Medu, Harsha and Raghavan, Vijay and Chettuvetty, Ramesh and others},
title = {An Accurate, Error-Tolerant, and Energy-Efficient Neural Network Inference Engine Based on SONOS Analog Memory},
doi = {10.1109/tcsi.2021.3134313},
url = {https://www.osti.gov/biblio/1842837},
journal = {IEEE Transactions on Circuits and Systems I: Regular Papers},
issn = {1549-8328},
number = {4},
volume = {69},
place = {United States},
publisher = {IEEE},
year = {2022},
month = {01}}
Research Organization:
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Organization:
USDOE National Nuclear Security Administration (NNSA); USDOE Laboratory Directed Research and Development (LDRD) Program; Defense Threat Reduction Agency (DTRA)
Grant/Contract Number:
NA0003525; HDTRA1-17-1-0038
OSTI ID:
1842837
Report Number(s):
SAND-2022-0047J; 703075
Journal Information:
IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 69, Issue 4; ISSN 1549-8328