Stochastic gradient method for training of a class of recurrent neural nets
A recurrent neural net is defined by a set $I$ of nodes, a set of input nodes $J \subset I$, a set of output nodes $K \subseteq I$, and a set of oriented arcs $A \subseteq I \times I$. Each node $i \in I$ is characterized by a state $z_i$ and a function $f^i(x, z_{i+})$, where $x$ is the vector of network parameters and $z_{i+}$ is the vector of states of the input nodes to node $i$, i.e. the nodes from which arcs pointing to node $i$ start. At the beginning, values $z_i^0$ are assigned to the states of all input nodes $i \in J$, and the net starts to function in discrete time $s = 0, 1, \ldots$ by changing the states as follows: $z_i^{s+1} = f^i(x, z_{i+}^s)$. To each output node $j \in K$ a reference value $y_j$ is assigned. The objective is to train the network, i.e. to select the values $x$ of the network parameters such that a measure of the difference between the reference values and the states of the output nodes is minimized: $\min_x F(x, z) = \sum_{j \in K} \varphi(y_j - z_j)$. The principal difficulty compared with simple feedforward networks is the presence of cycles, which lead to nontrivial transient behavior of the net. In this talk we use stochastic gradient ideas to construct an analogue of backpropagation techniques that permits training the network in real time, i.e. changing the vector $x$ at each moment of discrete time without waiting for the net to reach the steady state. We prove the convergence of the proposed techniques.
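The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea of changing the parameters at every time step while the net is still in its transient. It is not the authors' algorithm. It swaps in a forward sensitivity recursion in the style of real-time recurrent learning, and it assumes a specific node function (tanh of a weighted sum of incoming states plus a constant external input `u`), a squared-error penalty, and a three-node net with one cycle. The names `settle`, `loss`, `train_online`, `mask`, and `u` are hypothetical and introduced purely for illustration.

```python
import numpy as np

def settle(W, u, steps=50):
    """Run the net with fixed parameters: z^{s+1} = tanh(W z^s + u) (assumed form)."""
    z = np.zeros(W.shape[0])
    for _ in range(steps):
        z = np.tanh(W @ z + u)
    return z

def loss(z, y, out):
    """Sum over output nodes of (y_j - z_j)^2; the squared penalty is an assumption."""
    return float(np.sum((y - z[out]) ** 2))

def train_online(W, mask, u, y, out, steps=3000, lr=0.05):
    """Update W at every time step without waiting for a steady state.

    mask[i, k] = 1 if there is an arc from node k to node i, else 0.
    S[i, a, b] tracks an approximation of d z_i / d W[a, b] forward in time.
    """
    W = W.copy()
    n = W.shape[0]
    idx = np.arange(n)
    z = np.zeros(n)
    S = np.zeros((n, n, n))
    for _ in range(steps):
        z_new = np.tanh(W @ z + u)
        d = 1.0 - z_new ** 2                      # derivative of tanh
        # Chain rule: dz_i'/dW_ab = d_i * (sum_k W_ik S_kab + [i == a] * z_b)
        S_new = np.einsum("ik,kab->iab", W, S)
        S_new[idx, idx, :] += z                   # uses the state before the step
        S_new *= d[:, None, None]
        z, S = z_new, S_new
        # Approximate gradient of the loss at the current (transient) state.
        err = z[out] - y
        grad = 2.0 * np.einsum("j,jab->ab", err, S[out])
        W -= lr * grad * mask                     # only existing arcs are trained
    return W, z

# Hypothetical example: node 0 is driven by a constant input,
# nodes 1 and 2 form a cycle, and node 2 is the output node.
mask = np.zeros((3, 3))
mask[1, 0] = 1.0   # arc 0 -> 1
mask[2, 1] = 1.0   # arc 1 -> 2
mask[1, 2] = 1.0   # arc 2 -> 1 (closes the cycle)
W0 = 0.1 * mask
u = np.array([1.0, 0.0, 0.0])
out = np.array([2])
y = np.array([0.5])

W_trained, z_last = train_online(W0, mask, u, y, out)
print("loss before:", loss(settle(W0, u), y, out))
print("loss after: ", loss(settle(W_trained, u), y, out))
```

Because the sensitivities in `S` are propagated while `W` is still changing, the gradient used at each step is inexact. That is the kind of error a convergence analysis of a real-time scheme has to control, and this sketch makes no claim to reproduce the analysis in the talk.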
- OSTI ID: 36043
- Report Number(s): CONF-9408161-; TRN: 94:009753-0312
- Resource Relation: Conference: 15. international symposium on mathematical programming, Ann Arbor, MI (United States), 15-19 Aug 1994; Other Information: PBD: 1994; Related Information: Is Part Of Mathematical programming: State of the art 1994; Birge, J.R.; Murty, K.G. [eds.]; PB: 312 p.
- Country of Publication: United States
- Language: English