Q-Learning with Hidden-Unit Restarting

Charles W. Anderson
Department of Computer Science
Colorado State University
Fort Collins, CO 80523
Platt's resource-allocation network (RAN; Platt, 1991a, 1991b) is modified for a reinforcement-learning paradigm and to "restart" existing hidden units rather than adding new units. After restarting, units continue to learn via back-propagation. The resulting restart algorithm is tested in a Q-learning network that learns to solve an inverted pendulum problem. Solutions are found faster on average with the restart algorithm than without it.
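The restart idea from the abstract can be pictured as reinitializing one existing hidden unit's weights in place while leaving the rest of the network intact. The following is a minimal sketch under assumed details: the network sizes, the initialization distributions, and the choice of which weights to reset are illustrative, not the paper's exact rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-layer Q-network: 4 inputs, 8 hidden units, 2 actions.
# All names and initialization choices are illustrative, not the paper's
# exact specification.
n_in, n_hidden, n_actions = 4, 8, 2
W1 = rng.normal(0.0, 0.5, (n_hidden, n_in))       # input -> hidden weights
b1 = np.zeros(n_hidden)                           # hidden biases
W2 = rng.normal(0.0, 0.5, (n_actions, n_hidden))  # hidden -> output weights

def q_values(x):
    """Q(x, a) for every action a."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h

def restart_unit(j):
    """Reinitialize an existing hidden unit j in place ("restarting")
    instead of allocating a new unit as RAN would.  After the restart,
    the unit keeps learning by ordinary back-propagation."""
    W1[j] = rng.normal(0.0, 0.5, n_in)
    b1[j] = 0.0
    W2[:, j] = rng.normal(0.0, 0.05, n_actions)  # small outgoing weights

x = rng.normal(size=n_in)
old_unit3 = W1[3].copy()
w1_others = W1[[0, 1, 2]].copy()
restart_unit(3)
# Only unit 3 was reinitialized; the other units are untouched, so any
# structure they have learned is preserved.
q = q_values(x)  # shape (n_actions,)
```

Restarting rather than adding units keeps the network size fixed, which is the point of contrast with RAN in the abstract.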
1 Introduction
The goal of supervised learning is the discovery of a compact representation that
generalizes well. Such representations are typically found by incremental, gradient-based
search, such as error back-propagation. However, in the early stages of learning
a control task, we are more concerned with fast learning than a compact representation.
This implies a local representation, with the extreme being the memorization
of each experience. An initially local representation is also advantageous

