IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS--PART B: CYBERNETICS, VOL. 38, NO. 4, AUGUST 2008
Random Sampling of States in Dynamic Programming
Christopher G. Atkeson and Benjamin J. Stephens
Abstract--We combine three threads of research on approximate dynamic programming: sparse random sampling of states, value function and policy approximation using local models, and using local trajectory optimizers to globally optimize a policy and associated value function. Our focus is on finding steady-state policies for deterministic, time-invariant, discrete-time control problems with continuous states and actions, as often found in robotics. In this paper, we describe our approach and provide initial results on several simulated robotics problems.
Index Terms--Dynamic programming, optimal control, random sampling.
DYNAMIC programming provides a way to find globally optimal control laws (policies) u = u(x), which give the appropriate action u for any state x. Dynamic programming takes as input a one-step cost (a.k.a. "reward" or "loss") function and the dynamics of the problem to be optimized.
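To make the two inputs concrete, the following is a minimal sketch of tabular value iteration on a hypothetical discretized one-dimensional problem; the specific cost function, dynamics, and grid here are illustrative assumptions, not the method of this paper (which samples states randomly rather than on a regular grid).

```python
import numpy as np

# Tabular value iteration on a hypothetical 1-D problem.
# The two inputs named in the text are supplied explicitly:
# a one-step cost L(x, u) and deterministic dynamics x' = f(x, u).

n_states = 51
states = np.linspace(-1.0, 1.0, n_states)      # discretized states x
actions = np.array([-0.1, 0.0, 0.1])           # discretized actions u
gamma = 0.95                                   # discount factor

def dynamics(x, u):
    """Deterministic, time-invariant dynamics x' = f(x, u)."""
    return np.clip(x + u, -1.0, 1.0)

def cost(x, u):
    """One-step cost L(x, u): penalize distance from origin and effort."""
    return x**2 + 0.1 * u**2

def nearest_state(x):
    """Map a continuous successor state to its nearest grid index."""
    return int(np.argmin(np.abs(states - x)))

V = np.zeros(n_states)         # value function estimate
policy = np.zeros(n_states)    # greedy policy u = u(x) on the grid
for _ in range(500):
    V_new = np.empty(n_states)
    for i, x in enumerate(states):
        # Bellman backup: minimize one-step cost plus discounted future cost.
        q = [cost(x, u) + gamma * V[nearest_state(dynamics(x, u))]
             for u in actions]
        j = int(np.argmin(q))
        V_new[i] = q[j]
        policy[i] = actions[j]
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new

# The greedy policy drives the state toward the low-cost origin.
print(policy[0], policy[-1])
```

The backup loop shows why grid-based tables scale poorly with state dimension, which motivates the sparse random sampling of states pursued in this paper.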