Mean-Variance Problems for Finite Horizon Semi-Markov Decision Processes
- Sun Yat-Sen University, School of Mathematics and Computational Science (China)
This paper deals with a mean-variance problem for finite horizon semi-Markov decision processes. The state and action spaces are Borel spaces, and the reward function may be unbounded. The goal is to find an optimal policy with minimal finite horizon reward variance over the set of policies attaining a given mean. Using the theory of N-step contraction, we characterize the policies with a given mean and, under suitable conditions, convert the second moment of the finite horizon reward into the mean of an infinite horizon reward/cost generated by a discrete-time Markov decision process (MDP) with a two-dimensional state space and a new one-step reward/cost. We then establish the optimality equation and the existence of mean-variance optimal policies by applying existing results on discrete-time MDPs. We also provide a value iteration algorithm and a policy improvement algorithm for computing the value function and mean-variance optimal policies, respectively. In addition, a linear program and its dual are developed for solving the mean-variance problem.
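The abstract mentions a value iteration algorithm for computing the value function. As a minimal sketch of that idea, the toy example below runs finite-horizon value iteration (backward induction) on a small discrete-time MDP with two states and two actions. The paper's setting (Borel state/action spaces, semi-Markov dynamics, unbounded rewards) is far more general; the transition matrix `P` and reward matrix `r` here are made-up illustrative numbers, not from the paper.

```python
import numpy as np

# Toy MDP: 2 states, 2 actions.
# P[a, s, s'] = transition probability from s to s' under action a.
# r[a, s]     = one-step reward for taking action a in state s.
# These numbers are purely illustrative.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.7, 0.3]]])
r = np.array([[1.0, 0.0],
              [0.5, 2.0]])

def value_iteration(P, r, horizon):
    """Backward induction: V_t(s) = max_a { r[a,s] + sum_s' P[a,s,s'] V_{t+1}(s') }."""
    n_states = P.shape[1]
    V = np.zeros(n_states)           # terminal value V_T = 0
    for _ in range(horizon):
        Q = r + P @ V                # Q[a, s]: action values at this stage
        V = Q.max(axis=0)            # optimal value per state
    policy = (r + P @ V).argmax(axis=0)  # greedy first-stage action per state
    return V, policy
```

For a one-step horizon the optimal value is simply the best immediate reward in each state; longer horizons propagate future values back through the transition matrix.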
- OSTI ID: 22722847
- Journal Information: Applied Mathematics and Optimization, Vol. 72, Issue 2; Other Information: Copyright (c) 2015 Springer Science+Business Media New York; http://www.springer-ny.com; Country of input: International Atomic Energy Agency (IAEA); ISSN 0095-4616
- Country of Publication: United States
- Language: English