Summary of Notation
Notation for book draft by Sutton and Barto, with additions by David Peterson.
$t = 0, 1, 2, \ldots$   discrete time step
Basic Random Variables
$s_t \in S$   state at time $t$
$a_t \in A(s_t)$   action at time $t$
$r_t \in \Re$   reward at time $t$, due, like $s_t$, to $s_{t-1}$ and $a_{t-1}$
$R_t \in \Re$   return following time $t$ (Section 2.5)
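The summary defers the definition of the return to Section 2.5 and does not state it here. As a rough illustration only, the sketch below assumes the common discounted-sum form, in which the return following time t is the sum of later rewards weighted by powers of a discount rate. The function name `discounted_return` and the parameter `gamma` are hypothetical and do not appear in the notation above.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**k * rewards[k], assuming a discounted return.

    `rewards` holds the rewards received after time t, in order.
    `gamma` is an assumed discount rate; the summary does not define one.
    """
    total = 0.0
    # Work backwards so each step applies one more factor of gamma.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Three rewards observed after time t (made-up numbers).
example = discounted_return([1.0, 0.0, 2.0], gamma=0.5)
```

With `gamma=1.0` the same function gives the plain undiscounted sum of rewards, which is the other common convention for episodic tasks.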
Timeless Environmental Quantities
$p^a_{s,s'}$   probability of transition from state $s$ to state $s'$ under action $a$
$\rho(s, a)$   expected immediate reward from state $s$ after taking action $a$
$\pi$   a policy
$\pi(s, a)$   probability of taking action $a$ in state $s$ under policy $\pi$
$V^\pi(s)$   value of state $s$ under policy $\pi$ (expected return)
$V^*(s)$   value of state $s$ under the optimal policy
$Q^\pi(s, a)$   value of taking action $a$ in state $s$ under policy $\pi$
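To show how the quantities listed above fit together, here is a minimal sketch that computes state values and action values for a fixed policy by repeated one-step backups. It assumes a discounted return with rate `gamma`, which this summary does not define, and it uses a hypothetical two-state, two-action problem whose numbers are made up. The names `p`, `rho`, and `pi` mirror the transition probabilities, expected rewards, and policy from the list; `evaluate_policy` and `action_values` are illustrative names, not part of the source.

```python
# Hypothetical two-state, two-action problem; all numbers are invented.
states = [0, 1]
actions = [0, 1]
gamma = 0.9  # assumed discount rate, not defined in the notation summary

# p[a][s][s2]: probability of moving from state s to state s2 under action a
p = {
    0: [[1.0, 0.0], [0.0, 1.0]],  # action 0 keeps the current state
    1: [[0.0, 1.0], [1.0, 0.0]],  # action 1 switches to the other state
}
# rho[s][a]: expected immediate reward for taking action a in state s
rho = [[0.0, 1.0], [2.0, 0.0]]
# pi[s][a]: probability of taking action a in state s (uniform random policy)
pi = [[0.5, 0.5], [0.5, 0.5]]


def backup(s, a, v):
    """Expected reward plus discounted value of the next state."""
    return rho[s][a] + gamma * sum(p[a][s][s2] * v[s2] for s2 in states)


def evaluate_policy(tol=1e-10):
    """Iterate one-step backups until the state values stop changing."""
    v = [0.0 for _ in states]
    while True:
        new_v = [sum(pi[s][a] * backup(s, a, v) for a in actions)
                 for s in states]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def action_values(v):
    """Action values obtained from state values by one more backup."""
    return [[backup(s, a, v) for a in actions] for s in states]


v = evaluate_policy()   # state values under pi
q = action_values(v)    # action values under pi
```

Under these assumptions, averaging `q[s][a]` over actions with weights `pi[s][a]` recovers `v[s]`, which is a quick consistency check on any implementation of the two value functions.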

  

Source: Anderson, Charles W. - Department of Computer Science, Colorado State University

 

Collections: Computer Technologies and Information Sciences