
Ill-conditioning in neural network training problems

Journal Article · SIAM Journal on Scientific and Statistical Computing (Society for Industrial and Applied Mathematics), United States
DOI: https://doi.org/10.1137/0914044 · OSTI ID: 6492379
Saarinen, S.; Bradley, R.; Skeel, R. D. [1]
  1. Univ. of Illinois, Urbana (United States)

The training problem for feedforward neural networks is a nonlinear parameter estimation problem that can be solved by a variety of optimization techniques. Much of the literature on neural networks has focused on variants of gradient descent. Training with such techniques is known to be slow, and more sophisticated techniques do not always perform significantly better. This paper shows that feedforward neural networks can have ill-conditioned Hessians and that such ill-conditioning can be quite common. The analysis and experimental results lead to the conclusion that many network training problems are ill-conditioned and may not be solved more efficiently by higher-order optimization methods. While the analyses in this paper are for completely connected layered networks, they extend to networks with sparse connectivity as well. The results suggest that neural networks can have considerable redundancy in parameterizing the function space in a neighborhood of a local minimum, independently of whether or not the solution has a small residual.
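
The kind of ill-conditioning the abstract describes can be observed directly by forming the Hessian of the training loss for a very small network and inspecting its eigenvalue spread. Below is a minimal sketch, assuming JAX and a toy 2-2-1 tanh network on synthetic data; the network size, data, and parameter layout are illustrative assumptions, not taken from the paper. The ratio of largest to smallest Hessian eigenvalue magnitude is the condition number the paper is concerned with.

```python
import jax
import jax.numpy as jnp

# Toy setup (illustrative only): 8 training points in 2 dimensions,
# a scalar regression target, and a 2-2-1 fully connected tanh network.
key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (8, 2))
y = jnp.sin(X[:, 0] + X[:, 1])

n_in, n_hid = 2, 2
n_params = n_in * n_hid + n_hid + n_hid + 1  # W1, b1, w2, b2 flattened

def unpack(theta):
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = theta[i:i + n_hid]; i += n_hid
    w2 = theta[i:i + n_hid]; i += n_hid
    b2 = theta[i]
    return W1, b1, w2, b2

def loss(theta):
    W1, b1, w2, b2 = unpack(theta)
    h = jnp.tanh(X @ W1 + b1)              # hidden layer activations
    pred = h @ w2 + b2                     # linear output unit
    return 0.5 * jnp.mean((pred - y) ** 2) # sum-of-squares training error

theta = 0.1 * jax.random.normal(key, (n_params,))
H = jax.hessian(loss)(theta)               # exact Hessian via autodiff
eigs = jnp.linalg.eigvalsh(H)              # H is symmetric
print("eigenvalues:", eigs)
print("condition estimate (max|eig| / min|eig|):",
      jnp.abs(eigs).max() / jnp.abs(eigs).min())
```

Near a local minimum of a redundantly parameterized network, the smallest eigenvalues approach zero and the printed ratio becomes very large, which is the behavior analyzed in the paper; this sketch only evaluates the Hessian at an arbitrary point rather than after training.
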

DOE Contract Number:
FG02-85ER25001
OSTI ID:
6492379
Journal Information:
SIAM Journal on Scientific and Statistical Computing (Society for Industrial and Applied Mathematics), United States; Vol. 14, Issue 3; CODEN SIJCD4; ISSN 0196-5204
Country of Publication:
United States
Language:
English

Similar Records

A robust and efficient training algorithm for feedforward neural networks
Thesis/Dissertation · Dec 31, 1990 · OSTI ID: 7033768

Reducing Communication in Graph Neural Network Training
Conference · Nov 1, 2020 · SC '20: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis · OSTI ID: 1647608

Statistical and optimization methods to expedite neural network training for transient identification
Conference · Feb 28, 1993 · OSTI ID: 10147434