OSTI.GOV, U.S. Department of Energy
Office of Scientific and Technical Information

Title: Overcoming the Static Learning Bottleneck: The Need for Adaptive Neural Learning


Abstract not provided.

Publication Date: 2016
Research Org.: Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
Sponsoring Org.: USDOE National Nuclear Security Administration (NNSA)
Resource Type: Conference
Resource Relation: Conference: Proposed for presentation at the IEEE International Conference on Rebooting Computing, held October 17-19, 2016, in San Diego, CA.
Country of Publication: United States

Citation Formats

Vineyard, Craig Michael. Overcoming the Static Learning Bottleneck: The Need for Adaptive Neural Learning. United States: N. p., 2016. Web. doi:10.1109/ICRC.2016.7738692.
Vineyard, Craig Michael. Overcoming the Static Learning Bottleneck: The Need for Adaptive Neural Learning. United States. doi:10.1109/ICRC.2016.7738692.
Vineyard, Craig Michael. 2016. "Overcoming the Static Learning Bottleneck: The Need for Adaptive Neural Learning". United States. doi:10.1109/ICRC.2016.7738692.
@misc{vineyard2016overcoming,
  title = {Overcoming the Static Learning Bottleneck: The Need for Adaptive Neural Learning},
  author = {Vineyard, Craig Michael},
  abstractNote = {Abstract not provided.},
  doi = {10.1109/ICRC.2016.7738692},
  place = {United States},
  year = {2016}
}

Other availability
Please see Document Availability for additional information on obtaining the full-text document. Library patrons may search WorldCat to identify libraries that hold this conference proceeding.

  • The advent of massively parallel computers allows new algorithms to be devised for test pattern generation which promise to reduce run time significantly. The connection machine, currently under development at the MIT Artificial Intelligence Laboratory, is described as a vehicle for the implementation of CMTPG, a test pattern generation program. The algorithm employed allows all possible test patterns for a given stuck fault in a combinational logic network to be derived in time which grows linearly with the number of gates in the network and logarithmically with the size of the connection machine message routing network. Proof of the validity of the algorithm is given, and suggestions for further research are made. 6 references.
  • Deep Learning (DL) algorithms have become the de facto Machine Learning (ML) approach for large scale data analysis. DL algorithms are computationally expensive; even distributed DL implementations which use MPI require days of training (model learning) time on commonly studied datasets. Long running DL applications become susceptible to faults, requiring development of a fault tolerant system infrastructure in addition to fault tolerant DL algorithms. This raises an important question: what is needed from MPI for designing fault tolerant DL implementations? In this paper, we address this problem for permanent faults. We motivate the need for a fault tolerant MPI specification by an in-depth consideration of recent innovations in DL algorithms and their properties, which drive the need for specific fault tolerance features. We present an in-depth discussion on the suitability of different parallelism types (model, data and hybrid); a need (or lack thereof) for check-pointing of any critical data structures; and most importantly, consideration for several fault tolerance proposals (user-level fault mitigation (ULFM), Reinit) in MPI and their applicability to fault tolerant DL implementations. We leverage a distributed memory implementation of Caffe, currently available under the Machine Learning Toolkit for Extreme Scale (MaTEx). We implement our approaches by extending MaTEx-Caffe to use a ULFM-based implementation. Our evaluation using the ImageNet dataset and AlexNet neural network topology demonstrates the effectiveness of the proposed fault tolerant DL implementation using OpenMPI-based ULFM.
  • The backpropagation learning algorithm has proven to be a robust method for training feedforward multilayer neural networks to map the relationships between input/output patterns. However, as with many gradient descent optimization methods, the rate of convergence of the backpropagation algorithm decreases the closer it gets to the solution, and it requires judicious selection of the learning and momentum constants to achieve reasonable convergence and avoid oscillations about the optimum solution. In this paper, the discussion focuses on how the method of conjugate gradients can be combined with the backpropagation algorithm to improve and accelerate learning in neural networks and eliminate the process of selecting parameters. The proposed method was used to train a neural network to classify nuclear power plant transients, and it significantly expedited the learning process. 5 refs., 1 fig.
  • This paper presents the development of a pair of recursive least squares (RLS) algorithms for online training of multilayer perceptrons, which are a class of feedforward artificial neural networks. These algorithms incorporate second order information about the training error surface in order to achieve faster learning rates than are possible using first order gradient descent algorithms such as the generalized delta rule. A least squares formulation is derived from a linearization of the training error function. Individual training pattern errors are linearized about the network parameters that were in effect when the pattern was presented. This permits the recursive solution of the least squares approximation, either via conventional RLS recursions or by recursive QR decomposition-based techniques. The computational complexity of the update is O(N^2), where N is the number of network parameters. This is due to the estimation of the N × N inverse Hessian matrix. Less computationally intensive approximations of the RLS algorithms can be easily derived by using only block diagonal elements of this matrix, thereby partitioning the learning into independent sets. A simulation example is presented in which a neural network is trained to approximate a two dimensional Gaussian bump. In this example, RLS training required an order of magnitude fewer iterations on average (527) than did training with the generalized delta rule (6331). 14 refs., 3 figs.
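The stuck-fault test-pattern search described in the first related abstract above can be illustrated with a toy brute-force sketch: compare a fault-free combinational circuit against a copy with an internal net forced to a constant, over all input vectors. The circuit and the fault site are invented for illustration; CMTPG itself derives these patterns in parallel rather than by exhaustive simulation.

```python
# Toy stuck-at-fault test pattern generation: an input vector is a
# test pattern for a fault iff the faulty circuit's output differs
# from the fault-free circuit's output on that vector.
from itertools import product

def circuit(a, b, c, fault=None):
    """Tiny network: out = (a AND b) OR c. `fault` forces the
    internal net n1 to a constant (a stuck-at fault)."""
    n1 = a & b
    if fault == "n1_stuck_at_0":
        n1 = 0
    return n1 | c

def test_patterns(fault):
    """Every input vector whose output reveals the fault."""
    return [
        (a, b, c)
        for a, b, c in product((0, 1), repeat=3)
        if circuit(a, b, c) != circuit(a, b, c, fault=fault)
    ]

patterns = test_patterns("n1_stuck_at_0")
# n1 stuck-at-0 is only observable when a = b = 1 and c = 0.
```

Exhaustive search is exponential in the number of inputs; the abstract's point is that a massively parallel machine can derive all patterns far more efficiently.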
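The checkpointing question raised in the fault-tolerance abstract above can be sketched in a deliberately toy, single-process form: periodically snapshot the critical training state (here just a step counter and one parameter) so a restarted worker can resume after a permanent fault. All names and the file layout are assumptions for illustration; real MaTEx-Caffe/ULFM recovery operates over MPI and is far more involved.

```python
# Minimal checkpoint/restore sketch for a training loop: the only
# state that must survive a fault is snapshotted every few steps.
import os
import pickle
import tempfile

def train(steps, checkpoint_path, every=10, resume=False):
    state = {"step": 0, "weight": 0.0}
    if resume and os.path.exists(checkpoint_path):
        with open(checkpoint_path, "rb") as f:
            state = pickle.load(f)            # restart from last snapshot
    while state["step"] < steps:
        state["weight"] += 0.1                # stand-in for a gradient step
        state["step"] += 1
        if state["step"] % every == 0:
            with open(checkpoint_path, "wb") as f:
                pickle.dump(state, f)         # durable snapshot
    return state

path = os.path.join(tempfile.mkdtemp(), "ckpt.pkl")
partial = train(25, path)                     # worker "dies" after 25 steps
resumed = train(40, path, resume=True)        # replacement resumes at step 20
```

The trade-off the abstract discusses is exactly where this sketch cheats: in data-parallel MPI training, deciding which ranks checkpoint what, and how survivors reconstruct the communicator, is the hard part.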
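The conjugate-gradient idea in the backpropagation abstract above can be sketched by handing the flattened weights of a tiny feedforward network to a general-purpose CG minimizer, whose built-in line search replaces the hand-tuned learning-rate and momentum constants. The network shape, data, and target function are illustrative assumptions, not the paper's implementation.

```python
# Training a one-hidden-layer network with the conjugate gradient
# method via scipy, instead of fixed-rate gradient descent.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]           # target function to learn

H = 5                                         # hidden units (assumed size)

def unpack(w):
    W1 = w[: 2 * H].reshape(2, H)
    b1 = w[2 * H : 3 * H]
    W2 = w[3 * H : 4 * H]
    b2 = w[4 * H]
    return W1, b1, W2, b2

def loss(w):
    W1, b1, W2, b2 = unpack(w)
    hidden = np.tanh(X @ W1 + b1)
    pred = hidden @ W2 + b2
    return np.mean((pred - y) ** 2)

w0 = rng.normal(scale=0.1, size=4 * H + 1)
result = minimize(loss, w0, method="CG")      # CG line search replaces a
                                              # hand-picked learning rate
```

The design point mirrors the abstract: CG chooses its own step length along each search direction, so there are no learning-rate or momentum constants to tune.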
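The RLS recursions described in the last abstract above can be sketched on a plain linear model rather than a linearized multilayer perceptron; the O(N^2) per-step cost comes from maintaining the N × N inverse-correlation matrix P. The data and sizes are illustrative assumptions.

```python
# Recursive least squares on a linear model: each sample updates the
# weights and the inverse-correlation matrix P in O(N^2) time.
import numpy as np

def rls_fit(X, y, delta=100.0):
    n_features = X.shape[1]
    w = np.zeros(n_features)
    P = delta * np.eye(n_features)            # large initial P = weak prior
    for x, target in zip(X, y):
        k = P @ x / (1.0 + x @ P @ x)         # gain vector
        w = w + k * (target - x @ w)          # correct by prediction error
        P = P - np.outer(k, x @ P)            # rank-1 downdate of P
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w                                # noiseless targets
w = rls_fit(X, y)
```

The block-diagonal approximation the abstract mentions would shrink P to independent per-group blocks, trading some second-order information for a cheaper update.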