
- Advances in Neural Information Processing Systems 8, pp. 1038-1044, MIT Press, 1996. Generalization in Reinforcement
- To appear in Adaptive Behavior 6:2. Experiments with Reinforcement Learning in Problems
- Appeared in Proceedings of the Seventh Int. Conf. on Machine Learning, pp. 216-224, Morgan Kaufmann, 1990. Integrated Architectures for Learning, Planning, and Reacting
- Horde: A Scalable Real-time Architecture for Learning Knowledge from Unsupervised Sensorimotor Interaction
- Learn. Mem. 2010 17: 600-604. Access the most recent version at doi: 10.1101/lm.1942210
- iCORE ANNUAL REPORT 2010 iCORE CPE GRANT CPE45
- Artificial Intelligence as a Control Problem: Comments on the Relationship between Machine Learning and Intelligent Control
- SUTTON, Richard PIN: 278214 p. 6 1 Introduction and objectives
- Scaling Reinforcement Learning toward RoboCup Soccer Peter Stone pstone@research.att.com
- Temporal Abstraction in Temporal-difference Networks
- Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in
- Open Theoretical Questions in Reinforcement Learning
- Application of Connectionist Learning Methods to Manufacturing Process Monitoring
- Off-Policy Temporal-Difference Learning with Function Approximation
- Temporal-Difference Networks with History Brian Tanner and Richard S. Sutton
- Temporal-Difference Networks Richard S. Sutton and Brian Tanner
- Using Predictive Representations to Improve Generalization in Reinforcement Learning
- A. L. Samuel Abstract: Two machine-learning procedures have been investigated in some detail using the game of
- Machine Learning, 8, 341-362 (1992) © 1992 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
- Toward Off-Policy Learning Control with Function Approximation Hamid Reza Maei,
- The Grand Challenge of Predictive Empirical Abstract Knowledge
- Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation
- A Convergent O(n) Algorithm for Off-policy Temporal-difference Learning
- A computational model of hippocampal function in trace conditioning
- Incremental Natural Actor-Critic Algorithms Shalabh Bhatnagar
- On the Role of Tracking in Stationary Environments Richard S. Sutton sutton@cs.ualberta.ca
- Reinforcement Learning of Local Shape in the Game of Go David Silver, Richard Sutton, and Martin Muller
- iLSTD: Eligibility Traces and Convergence Analysis Alborz Geramifard Michael Bowling Martin Zinkevich
- Incremental Least-Squares Temporal Difference Learning Alborz Geramifard Michael Bowling Richard S. Sutton
- TD(λ) Networks: Temporal-Difference Networks with Eligibility Traces
- Reinforcement Learning for RoboCup Soccer Peter Stone
- Technical Report 98-74, Dept. of Computer Science, University of Massachusetts, Amherst, MA 01003. April, 1998. Between MDPs and Semi-MDPs
- Improved Switching among Temporally Abstract Actions
- Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces
- On the Significance of Markov Decision Processes
- Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation
- Appeared in Proceedings of the Tenth Int. Conf. on Machine Learning, pp. 314-321, Morgan Kaufmann, 1993. Online Learning with Random Representations
- Appeared in Proceedings of the Tenth National Conf. on Artificial Intelligence, pp. 171-176, MIT Press, 1992. Adapting Bias by Gradient Descent
- Reinforcement Learning Architectures Richard S. Sutton
- Approximately as appeared in: Learning and Computational Neuroscience: Foundations of Adaptive Networks, M. Gabriel and J. Moore, Eds., pp. 497-537. MIT Press, 1990.
- Machine Learning 3: 9-44, 1988 © 1988 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands
- Biol. Cybern. 43, 175-185 (1982) Biological Cybernetics
- Biol. Cybern. 40, 201-211 (1981) Biological Cybernetics
- Biol. Cybern. 42, 1-8 (1981) Biological Cybernetics
- GQ(λ): A general gradient algorithm for temporal-difference prediction learning with eligibility traces
- Least-Squares Temporal Difference Learning Justin A. Boyan
- Magnitude and Timing of Conditioned Responses in Delay and Trace Classical Conditioning of the Nictitating Membrane Response
- ARTICLE IN PRESS Automatica ( )
- Natural Actor-Critic Algorithms Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, and Mark Lee
- Sample-Based Learning and Search with Permanent and Transient David Silver silver@cs.ualberta.ca
- Keepaway Soccer: A Machine Learning Testbed Peter Stone and Richard S. Sutton
- In Machine Learning: ECML'98. 10th European Conference on Machine Learning, Chemnitz, Germany, April 1998. Proceedings, pp. 382-393. Springer Verlag.
- Hierarchical Optimal Control of MDPs Amy McGovern
- Advances in Neural Information Processing Systems 8, pp. 1038-1044, MIT Press, 1996. Generalization in Reinforcement
- Model-Based Reinforcement Learning with an Approximate, Learned Model
- GTE Laboratories Technical Note TN87-509.1 May 1987 Corrected Aug 1989 Implementation Details of the TD(λ) Procedure
- Machine Learning 3: 9-44, 1988 © 1988 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands
- Scalar Timing Varies With Response Magnitude in Classical Conditioning of the Nictitating Membrane Response of the Rabbit
- Policy Gradient Methods for Reinforcement Learning with Function
- STEPS TOWARD ARTIFICIAL INTELLIGENCE Marvin Minsky
- Magnitude and Timing of Nictitating Membrane Movements During Classical Conditioning of the Rabbit (Oryctolagus cuniculus)
- Intra-Option Learning about Temporally Abstract Actions Richard S. Sutton
- Towards Self-Learning Adaptive Scheduling for ATM R. K. Mehra, B. Ravichandran, J. B. D. Cabrera, D. N. Greve and R. S. Sutton
- Machine Learning, 8, 225-227 (1992) © 1992 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
- Appeared in Y.C. Lee (ed.) Evolution, Learning and Cognition, 391-403, 1988 World Scientific SELECTED BIBLIOGRAPHY ON CONNECTIONISM
- GTE TR88-509.4 NADALINE: A Normalized Adaptive Linear Element that Learns Efficiently
- Learning a Nonlinear Model of a Manufacturing Process Using Multilayer Connectionist Networks
- dimension 1 query point
- Reinforcement Learning Richard S. Sutton
- Appeared in Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems, pp. 161-166, 1992. Gain Adaptation Beats Least Squares?
- LETTER Communicated by Peter Dayan Stimulus Representation and the Timing of Reward-Prediction
- University of Massachusetts, Amherst Technical Report Number 98-70. Macro-Actions in Reinforcement Learning: An Empirical
- Comparing Policy-Gradient Algorithms Richard S. Sutton sutton@research.att.com
- Machine Learning, 8, 257-277 (1992) © 1992 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
- Advances in Reinforcement Learning and Their Implications for Intelligent Control
- Machine Learning, 8, 293-321 (1992) © 1992 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
- Machine Learning, 8, 229-256 (1992) © 1992 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
- Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games
- iCORE Research Grant Renewal Proposal Reinforcement Learning and Artificial Intelligence
- Reinforcement Learning: Past, Present and Future?
- GQ(λ) Quick Reference Guide Adam White and Richard S. Sutton
- Online Human Training of a Myoelectric Prosthesis Controller via Actor-Critic Reinforcement Learning
- Eligibility Traces for Off-Policy Policy Evaluation Doina Precup DPRECUP@CS.UMASS.EDU
- This article was written by Rich Sutton and published as part of a large technical report prepared by him and by Andrew Barto in 1981. It should be cited as follows
- Introduction The idea that we learn by interacting with our environment is probably the
- Multi-timescale Nexting in a Reinforcement Learning Robot
- University of Alberta Gradient Temporal-Difference Learning Algorithms
- DOI 10.1007/s10994-012-5280-0 Temporal-difference search in computer Go
- TUNING-FREE STEP-SIZE ADAPTATION Ashique Rupam Mahmood Richard S. Sutton Thomas Degris Patrick M. Pilarski