 
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 2, NO. 4, JULY 1991
Here, $\sum_{E_n}$, $\sum_{E_{n-1}}$, etc. denote sums over all neurons in layer $n$, $n-1$, etc.; $\delta_j^*(k)$ for any layer $1 \le k \le n-1$ is given by

$$\delta_j^*(k) = \Big(\sum_{E_{k+1}} \delta_i^*(k+1)\, w_{ij}(k+1)\Big) f'(\mathrm{net}_j(k)).$$
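The layer-by-layer recursion for $\delta_j^*(k)$ above can be sketched in code: each hidden layer's deltas are obtained from the next layer's deltas through the connecting weights and the activation derivative. This is a minimal illustrative sketch; the function and variable names (`backward_deltas`, `weights`, `nets`) and the sigmoid choice of $f$ are assumptions, not taken from the paper.

```python
import numpy as np

def f(x):
    # Illustrative choice of activation: sigmoid (not specified here by the paper)
    return 1.0 / (1.0 + np.exp(-x))

def f_prime(net):
    # Derivative of the sigmoid: f'(net) = f(net) * (1 - f(net))
    s = f(net)
    return s * (1.0 - s)

def backward_deltas(weights, nets, delta_out):
    """Compute delta*(k) for every layer, from the output layer back down.

    weights[k] : matrix mapping layer k outputs to layer k+1 net inputs,
                 shape (neurons in layer k+1, neurons in layer k)
    nets[k]    : net inputs of layer k (0-indexed over layers 1..n)
    delta_out  : delta* at the output layer n
    """
    deltas = [None] * len(nets)
    deltas[-1] = delta_out
    # Recurse from layer n-1 down to layer 1:
    # delta_j*(k) = (sum over i in layer k+1 of delta_i*(k+1) w_ij(k+1)) f'(net_j(k))
    for k in range(len(nets) - 2, -1, -1):
        deltas[k] = (weights[k].T @ deltas[k + 1]) * f_prime(nets[k])
    return deltas
```

The sum over all neurons in layer $k+1$ appears here as the matrix-transpose product `weights[k].T @ deltas[k + 1]`, which collects, for each neuron $j$ in layer $k$, the weighted deltas of every neuron it feeds.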
IV. CONCLUSION
A learning algorithm based on dynamic programming has been derived for multilayer neural networks. The advantage of this algorithm over other well-known algorithms [4], [5] is that it provides a recursive relationship to compute a minimizing error function for every hidden layer, expressed explicitly in terms of the weights and outputs of the hidden layer. The algorithm can be used even when neuron activation functions are not continuous.
REFERENCES
[1] R. E. Bellman and S. E. Dreyfus, Applied Dynamic Programming. Princeton, NJ: Princeton University Press, 1962.
[2] L. Cooper and M. Cooper, Introduction to Dynamic Programming. Elmsford, NY: Pergamon, 1981.
[3] R. E. Larson and J. L. Casti, Principles of Dynamic Programming. New York: Marcel Dekker, 1978.
[4] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
