DOE Patents | U.S. Department of Energy
Office of Scientific and Technical Information

Title: Runtime extension for neural network training with heterogeneous memory

Abstract

Systems, apparatuses, and methods for managing buffers in a neural network implementation with heterogeneous memory are disclosed. A system includes a neural network coupled to a first memory and a second memory. The first memory is a relatively low-capacity, high-bandwidth memory while the second memory is a relatively high-capacity, low-bandwidth memory. During a forward propagation pass of the neural network, a run-time manager monitors the usage of the buffers for the various layers of the neural network. During a backward propagation pass of the neural network, the run-time manager determines how to move the buffers between the first and second memories based on the monitored buffer usage during the forward propagation pass. As a result, the run-time manager is able to reduce memory access latency for the layers of the neural network during the backward propagation pass.
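The abstract describes a two-phase scheme: the run-time manager profiles per-layer buffer usage during the forward pass, then uses that profile to decide buffer placement between the high-bandwidth and high-capacity memories for the backward pass. A minimal Python sketch of that idea follows; it is illustrative only, and the class name, method names, and the simple greedy placement policy are assumptions, not the patented implementation:

```python
class RuntimeManager:
    """Illustrative sketch, not the patented method: profile buffer usage
    in the forward pass, then plan placement for the backward pass."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity  # bytes of high-bandwidth memory (hypothetical budget)
        self.usage_log = []                 # (layer, buffer_name, size) in forward-pass order

    def record_forward(self, layer, buffer_name, size):
        """Called once per buffer use during the forward pass."""
        self.usage_log.append((layer, buffer_name, size))

    def plan_backward(self):
        """Greedy placement (an assumed policy): the backward pass revisits
        layers in reverse order, so buffers from the last forward layers are
        needed first. Keep as many of those as fit in fast memory; assign
        the rest to the high-capacity, low-bandwidth memory."""
        plan = {}
        free = self.fast_capacity
        for layer, name, size in reversed(self.usage_log):
            if size <= free:
                plan[(layer, name)] = "fast"
                free -= size
            else:
                plan[(layer, name)] = "slow"
        return plan
```

For example, with a 100-byte fast-memory budget and two activation buffers of 60 and 50 bytes, the buffer needed first in the backward pass (the later layer's) lands in fast memory and the other spills to slow memory.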

Inventors:
Mappouras, Georgios; Farmahini-Farahani, Amin; Gurumurthi, Sudhanva; Vishnu, Abhinav; Loh, Gabriel H.
Issue Date:
October 2023
Research Org.:
Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States); Advanced Micro Devices, Inc., Santa Clara, CA (United States)
Sponsoring Org.:
USDOE
OSTI Identifier:
2293673
Patent Number(s):
11775799
Application Number:
16/194,958
Assignee:
Advanced Micro Devices, Inc. (Santa Clara, CA)
Patent Classifications (CPCs):
G - PHYSICS G06 - COMPUTING G06F - ELECTRIC DIGITAL DATA PROCESSING
G - PHYSICS G06 - COMPUTING G06N - COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
DOE Contract Number:  
AC52-07NA27344; B620717
Resource Type:
Patent
Resource Relation:
Patent File Date: 11/19/2018
Country of Publication:
United States
Language:
English

Citation Formats

Mappouras, Georgios, Farmahini-Farahani, Amin, Gurumurthi, Sudhanva, Vishnu, Abhinav, and Loh, Gabriel H. Runtime extension for neural network training with heterogeneous memory. United States: N. p., 2023. Web.
Mappouras, Georgios, Farmahini-Farahani, Amin, Gurumurthi, Sudhanva, Vishnu, Abhinav, & Loh, Gabriel H. Runtime extension for neural network training with heterogeneous memory. United States.
Mappouras, Georgios, Farmahini-Farahani, Amin, Gurumurthi, Sudhanva, Vishnu, Abhinav, and Loh, Gabriel H. Tue . "Runtime extension for neural network training with heterogeneous memory". United States. https://www.osti.gov/servlets/purl/2293673.
@article{osti_2293673,
title = {Runtime extension for neural network training with heterogeneous memory},
author = {Mappouras, Georgios and Farmahini-Farahani, Amin and Gurumurthi, Sudhanva and Vishnu, Abhinav and Loh, Gabriel H.},
abstractNote = {Systems, apparatuses, and methods for managing buffers in a neural network implementation with heterogeneous memory are disclosed. A system includes a neural network coupled to a first memory and a second memory. The first memory is a relatively low-capacity, high-bandwidth memory while the second memory is a relatively high-capacity, low-bandwidth memory. During a forward propagation pass of the neural network, a run-time manager monitors the usage of the buffers for the various layers of the neural network. During a backward propagation pass of the neural network, the run-time manager determines how to move the buffers between the first and second memories based on the monitored buffer usage during the forward propagation pass. As a result, the run-time manager is able to reduce memory access latency for the layers of the neural network during the backward propagation pass.},
place = {United States},
year = {2023},
month = {10}
}

Works referenced in this record:

EIE: Efficient Inference Engine on Compressed Deep Neural Network (conference, June 2016)
Texture features for biometric authentication (patent, February 2013)
moDNN: Memory optimal DNN training on GPUs (conference, March 2018)
Deep Neural Network Processor with Interleaved Backpropagation (patent-application, May 2019)
Low Overhead Message Passing for High Performance Many-Core Processors (conference, December 2013)
Reconfigurable hardware accelerator for boolean satisfiability solver (patent, March 2012)
Machine Learning Inference Engine Scalability (patent-application, October 2019)
Optimal Tiling Strategy for Memory Bandwidth Reduction for CNNs (book, January 2017)
Efficient FPGA Acceleration of Convolutional Neural Networks Using Logical-3D Compute Array (conference, January 2016)
Auto Generation and Tuning Tool for Convolution Kernels (patent-application, September 2020)
Variable rate vocoder (patent, August 1997)
Methods and systems for reduced complexity nonlinear compensation (patent, April 2016)
Profile-guided proactive garbage collection for locality optimization (journal, June 2006)
Memory Bandwidth Reduction Techniques for Low Power Convolutional Neural Network Inference Applications (patent-application, May 2019)
Low Latency Long Short-Term Memory Inference with Sequence Interleaving (patent-application, April 2020)
Tiling format for convolutional neural networks (patent, September 2020)
Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks (conference, February 2018)
System and method for improved general object detection using neural networks (patent, September 2018)
Method for Calculating an Output of a Neural Network (patent-application, August 2019)
vDNN: Virtualized deep neural networks for scalable, memory-efficient neural network design (conference, October 2016)
F-C3D: FPGA-based 3-dimensional convolutional neural network (conference, September 2017)
Dynamic Acceleration of Data Processor Operations using Data-Flow Analysis (patent-application, October 2019)