 
Homework 3
Due November 1, Tuesday
1. Consider a model with S = {s1, s2}, A_s1 = {a11, a12}, A_s2 = {a21, a22, a23},
p(s1 | s1, a11) = 1, p(s1 | s1, a12) = 0.5, p(s1 | s2, a21) = 1, p(s1 | s2, a22) = 0,
and p(s1 | s2, a23) = 0.75.
a. Determine the chain structure of each deterministic stationary policy.
b. Classify the Markov decision process problem.
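
As a way to check hand work on Problem 1, here is a minimal Python sketch that enumerates the deterministic stationary policies and counts the recurrent classes of each induced chain. It assumes each given probability is the probability of moving to s1 from the state in which the action is available, so that the probability of moving to s2 is its complement. The names P_TO_S1, ACTIONS, transition_matrix, recurrent_classes, and enumerate_policies are illustrative and not part of the assignment.

```python
from itertools import product

# Assumed reading of the problem data: probability of moving to s1 under each
# action. The probability of moving to s2 is the complement (two states only).
P_TO_S1 = {
    "a11": 1.0, "a12": 0.5,                # actions available in s1
    "a21": 1.0, "a22": 0.0, "a23": 0.75,   # actions available in s2
}
ACTIONS = {"s1": ["a11", "a12"], "s2": ["a21", "a22", "a23"]}


def transition_matrix(policy):
    """Return the 2x2 matrix (rows: s1, s2) induced by a deterministic policy.

    `policy` is a pair (action chosen in s1, action chosen in s2).
    """
    return [[P_TO_S1[a], 1.0 - P_TO_S1[a]] for a in policy]


def recurrent_classes(P):
    """Return the closed communicating classes of a finite chain as frozensets."""
    n = len(P)
    # reach[i][j] is True when j can be reached from i in zero or more steps.
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):  # Warshall's transitive closure
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    classes = set()
    for i in range(n):
        # State i is recurrent iff every state reachable from i can return to i.
        if all(reach[j][i] for j in range(n) if reach[i][j]):
            classes.add(frozenset(j for j in range(n) if reach[i][j]))
    return classes


def enumerate_policies():
    """Yield (policy, transition matrix, recurrent classes) for each policy."""
    for policy in product(ACTIONS["s1"], ACTIONS["s2"]):
        P = transition_matrix(policy)
        yield policy, P, recurrent_classes(P)


if __name__ == "__main__":
    for policy, P, classes in enumerate_policies():
        print(policy, P, "recurrent classes:", len(classes))
```

A policy whose chain has exactly one recurrent class (possibly with transient states) is unichain; more than one recurrent class means multichain. State indices 0 and 1 in the output stand for s1 and s2.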
2. Show that if all stationary deterministic policies are unichain, then all
stationary randomized policies are unichain.
3. A decision maker observes a discrete-time system that moves between
states {s1, s2, s3, s4} according to the following transition probability matrix:
P =
0.3 0.4 0.2 0.1
0.2 0.3 0.5 0.0
