Yu, Huizhen Janey - Department of Computer Science, University of Helsinki
Distributed Asynchronous Policy Iteration in Dynamic Programming, Dimitri P. Bertsekas and Huizhen Yu, April 2010 (Revised October 2010), Report LIDS-2831
Convergence of Least Squares Temporal Difference Methods Under General Conditions
An Efficient Method for Large Margin Parameter Optimization in Structured Prediction Problems
Projected Equation Methods for, J. of Computational and Applied Mathematics, June 2008 (to appear)
Least Squares Q-Learning Variants with Reduced Computation
Q-learning Algorithms for Optimal Stopping Based on Least
Convergence Results for Some Temporal Difference Methods, LIDS Report 2697
New Error Bounds for Approximations from Projected Linear Equations
Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
Approximate Solution Methods for Partially Observable Markov and Semi-Markov Decision Processes, Mathematics of Operations Research, Vol. 33, No. 1, February 2008, pp. 111
Least Squares Temporal Difference Methods: An Analysis Under General Conditions
Combining Expert Advice, Jyrki Kivinen
A Function Approximation Approach to Estimation of Policy Gradient for POMDP with Structured Policies
Game Theory and Reinforcement Learning with Applications to