Williams, Ronald J. - College of Computer Science, Northeastern University
Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions
Some Observations on the Use of the Extended Kalman Filter as a
Reinforcement Learning Algorithms as Function Optimizers - Ronald J. Williams and Jing Peng
Analysis of Some Incremental Variants of Policy Iteration: First Steps Toward Understanding
A Learning Algorithm for Continually Running Fully Recurrent Neural Networks
Robust, Efficient, Globally-Optimized Reinforcement Learning with the
A Mathematical Analysis of Actor-Critic Architectures for Learning Optimal Controls
An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories
Function Optimization Using Connectionist Reinforcement Learning Algorithms
Modifying the Parti-Game Algorithm for Increased Robustness, Higher Efficiency and Better Policies
1-8, © Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
An Approach to Using Rule-Like Training Data in Connectionist Networks
Gradient-Based Learning Algorithms for Recurrent Networks and Their Computational Complexity
Incremental Multi-Step Q-Learning
Adaptive State Representation and Estimation Using Recurrent Connectionist Networks
Training Recurrent Networks Using the Extended Kalman Filter
Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
Gradient-Based Learning Algorithms for Recurrent Connectionist Networks
Temporal Difference Learning: A Chemical Process Control Application
Modifying the Parti-Game Algorithm for Increased Robustness, Higher
Efficient Learning and Planning Within the Dyna - Jing Peng and Ronald J. Williams