Summary: IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 2, NO. 4, JULY 1991 461
Here, $\sum_n$, $\sum_{n-1}$, etc. denote summation over all neurons in layers $n$, $n-1$, etc., and
$\delta_i^*(k)$ for any layer $1 \le k \le n-1$ is given by
$$\delta_i^*(k) = w_i^*(k)\, f'(\mathrm{net}_i^*(k)).$$
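As a rough illustration of the relation above, the following Python sketch evaluates the product of a weight and the activation derivative at the net input, neuron by neuron, for one layer k. Several things here are assumptions rather than part of the paper: the function names (`sigmoid`, `sigmoid_prime`, `delta_star`) are hypothetical, a logistic sigmoid is assumed for f, and each neuron is assumed to carry one weight term and one net input, since the exact indexing is not recoverable from the text.

```python
import math


def sigmoid(x):
    """Logistic activation; assumed here, the paper does not fix f."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    """Derivative of the logistic activation, f'(x) = f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def delta_star(weights, nets, f_prime=sigmoid_prime):
    """Return w * f'(net) for every neuron of one layer k.

    weights : per-neuron weight terms of layer k (hypothetical layout)
    nets    : per-neuron net inputs of layer k
    f_prime : derivative of the activation function
    """
    if len(weights) != len(nets):
        raise ValueError("weights and nets must have the same length")
    return [w * f_prime(net) for w, net in zip(weights, nets)]


if __name__ == "__main__":
    # One layer with three neurons; the values are arbitrary examples.
    layer_weights = [2.0, -1.0, 0.5]
    layer_nets = [0.0, 1.0, -2.0]
    print(delta_star(layer_weights, layer_nets))
```

Passing a different `f_prime` lets the same sketch be reused with another differentiable activation; it does not cover the discontinuous case, which needs the paper's own treatment.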
A learning algorithm based on dynamic programming has been
derived for multilayer neural networks. The advantage of this
algorithm over other well-known algorithms [4], [5] is that it
provides a recursive relationship to compute a minimizing error
function for every hidden layer, expressed explicitly in terms of the
weights and outputs of the hidden layer. The algorithm can be used
even when neuron activation functions are not continuous.
[1] R. E. Bellman and S. E. Dreyfus, Applied Dynamic Programming.
Princeton, NJ: Princeton University Press, 1962.
[2] L. Cooper and M. Cooper, Introduction to Dynamic Programming.
Elmsford, NY: Pergamon, 1981.
[3] R. E. Larson and J. L. Casti, Principles of Dynamic Programming.
New York: Marcel Dekker, 1978.
[4] D. E. Rumelhart, et al., "Learning representations by back propagat-