DOE PAGES — U.S. Department of Energy
Office of Scientific and Technical Information

Title: Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity

Abstract

In this paper, we consider a distributionally robust partially observable Markov decision process (DR-POMDP), where the distribution of the transition-observation probabilities is unknown at the beginning of each decision period, but their realizations can be inferred using side information at the end of each period, after an action is taken. We build an ambiguity set of the joint distribution using bounded moments via conic constraints and seek an optimal policy to maximize the worst-case (minimum) reward over all distributions in the set. We show that the value function of DR-POMDP is piecewise linear convex with respect to the belief state and propose a heuristic search value iteration method for obtaining lower and upper bounds of the value function. We conduct numerical studies and demonstrate the computational performance of our approach on test instances of a dynamic epidemic control problem. Our results show that DR-POMDP produces more robust policies than POMDP under misspecified distributions of transition-observation probabilities, while yielding less costly solutions than robust POMDP. The DR-POMDP policies are also insensitive to varying parameters in the ambiguity set and to noise added to the true transition-observation probability values obtained at the end of each decision period.
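As a minimal sketch of the inner worst-case step described above (not the paper's implementation): for a finite outcome space, minimizing an expected reward over all distributions whose moments lie within given bounds is a linear program. All names and the toy moment constraints below are hypothetical illustrations, assuming one linear moment function per row of `moment_matrix`.

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_expectation(values, moment_matrix, lo, hi):
    """Worst-case (minimum) expectation of `values` over all distributions p
    on n outcomes whose moments satisfy lo <= M @ p <= hi.

    Solves:  min_p  values @ p
             s.t.   sum(p) = 1,  p >= 0,  lo <= M @ p <= hi
    """
    M = np.atleast_2d(moment_matrix)
    n = len(values)
    # Two-sided moment bounds stacked as one-sided inequalities.
    A_ub = np.vstack([M, -M])
    b_ub = np.concatenate([np.asarray(hi), -np.asarray(lo)])
    A_eq = np.ones((1, n))  # probabilities sum to one
    res = linprog(values, A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=[1.0], bounds=(0, 1))
    return res.fun

def dr_greedy_action(reward_vectors, moment_matrix, lo, hi):
    """Pick the action maximizing the worst-case expected reward."""
    scores = [worst_case_expectation(r, moment_matrix, lo, hi)
              for r in reward_vectors]
    return int(np.argmax(scores)), scores
```

For example, with two outcomes, reward vector `[1, 0]`, and the single moment constraint that outcome 2 has probability between 0.3 and 0.6, the adversary places as much mass as allowed on outcome 2, giving a worst-case expectation of 0.4. A full DR-POMDP solver would embed such a worst-case computation inside a value-iteration backup over the belief state.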

Authors:
Nakao, Hideaki [1]; Jiang, Ruiwei [1]; Shen, Siqian [1]
  1. Univ. of Michigan, Ann Arbor, MI (United States)
Publication Date:
February 1, 2021
Research Org.:
Univ. of Michigan, Ann Arbor, MI (United States)
Sponsoring Org.:
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF)
OSTI Identifier:
1785682
Grant/Contract Number:  
SC0018018; CMMI-1727618
Resource Type:
Accepted Manuscript
Journal Name:
SIAM Journal on Optimization
Additional Journal Information:
Journal Volume: 31; Journal Issue: 1; Related Information: Hideaki Nakao, Ruiwei Jiang, Siqian Shen, “Distributionally robust Partially Observable Markov Decision Process with moment-based ambiguity,” SIAM Journal on Optimization (SIOPT), 31(1), 461–488, 2021.; Journal ID: ISSN 1052-6234
Publisher:
SIAM
Country of Publication:
United States
Language:
English
Subject:
97 MATHEMATICS AND COMPUTING; partially observable Markov decision process; POMDP; distributionally robust optimization; moment-based ambiguity set; heuristic search value iteration; HSVI; epidemic control

Citation Formats

Nakao, Hideaki, Jiang, Ruiwei, and Shen, Siqian. Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity. United States: N. p., 2021. Web. doi:10.1137/19m1268410.
Nakao, Hideaki, Jiang, Ruiwei, & Shen, Siqian. Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity. United States. https://doi.org/10.1137/19m1268410
Nakao, Hideaki, Jiang, Ruiwei, and Shen, Siqian. 2021. "Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity". United States. https://doi.org/10.1137/19m1268410. https://www.osti.gov/servlets/purl/1785682.
@article{osti_1785682,
title = {Distributionally Robust Partially Observable Markov Decision Process with Moment-Based Ambiguity},
author = {Nakao, Hideaki and Jiang, Ruiwei and Shen, Siqian},
abstractNote = {In this paper, we consider a distributionally robust partially observable Markov decision process (DR-POMDP), where the distribution of the transition-observation probabilities is unknown at the beginning of each decision period, but their realizations can be inferred using side information at the end of each period, after an action is taken. We build an ambiguity set of the joint distribution using bounded moments via conic constraints and seek an optimal policy to maximize the worst-case (minimum) reward over all distributions in the set. We show that the value function of DR-POMDP is piecewise linear convex with respect to the belief state and propose a heuristic search value iteration method for obtaining lower and upper bounds of the value function. We conduct numerical studies and demonstrate the computational performance of our approach on test instances of a dynamic epidemic control problem. Our results show that DR-POMDP produces more robust policies than POMDP under misspecified distributions of transition-observation probabilities, while yielding less costly solutions than robust POMDP. The DR-POMDP policies are also insensitive to varying parameters in the ambiguity set and to noise added to the true transition-observation probability values obtained at the end of each decision period.},
doi = {10.1137/19m1268410},
journal = {SIAM Journal on Optimization},
number = 1,
volume = 31,
place = {United States},
year = {2021},
month = {feb}
}

Works referenced in this record:

Perturbation and stability theory for Markov control problems
journal, January 1992

  • Abbad, M.; Filar, J. A.
  • IEEE Transactions on Automatic Control, Vol. 37, Issue 9
  • DOI: 10.1109/9.159584

Robust Solutions of Optimization Problems Affected by Uncertain Probabilities
journal, February 2013

  • Ben-Tal, Aharon; den Hertog, Dick; De Waegenaere, Anja
  • Management Science, Vol. 59, Issue 2
  • DOI: 10.1287/mnsc.1120.1641

Percentile Optimization for Markov Decision Processes with Parameter Uncertainty
journal, February 2010


Distributionally Robust Optimization Under Moment Uncertainty with Application to Data-Driven Problems
journal, June 2010


Data-driven distributionally robust optimization using the Wasserstein metric: performance guarantees and tractable reformulations
journal, July 2017


A new distribution-free quantile estimator
journal, January 1982


Planning treatment of ischemic heart disease with partially observable Markov decision processes
journal, March 2000


The Mathematics of Infectious Diseases
journal, January 2000


Robust Dynamic Programming
journal, May 2005


Data-driven chance constrained stochastic program
journal, July 2015


Monitoring epidemiologic surveillance data using hidden Markov models
journal, December 1999


Robust MDPs with k -Rectangular Uncertainty
journal, November 2016

  • Mannor, Shie; Mebel, Ofir; Xu, Huan
  • Mathematics of Operations Research, Vol. 41, Issue 4
  • DOI: 10.1287/moor.2016.0786

Robust Control of Markov Decision Processes with Uncertain Transition Matrices
journal, October 2005


The Optimal Control of Partially Observable Markov Processes over a Finite Horizon
journal, October 1973


Adaptive Inventory Control for Nonstationary Demand and Partial Information
journal, May 2002


Robust Markov Decision Processes
journal, February 2013

  • Wiesemann, Wolfram; Kuhn, Daniel; Rustem, Berç
  • Mathematics of Operations Research, Vol. 38, Issue 1
  • DOI: 10.1287/moor.1120.0566

Distributionally Robust Convex Optimization
journal, December 2014

  • Wiesemann, Wolfram; Kuhn, Daniel; Sim, Melvyn
  • Operations Research, Vol. 62, Issue 6
  • DOI: 10.1287/opre.2014.1314

Distributionally Robust Markov Decision Processes
journal, May 2012


Distributionally Robust Counterpart in Markov Decision Processes
journal, September 2016


Distributionally robust joint chance constraints with second-order moment information
journal, November 2011


Algorithms for singularly perturbed limiting average Markov control problems
conference, January 1990

  • Abbad, M.; Filar, J. A.; Bielecki, T. R.
  • 29th IEEE Conference on Decision and Control
  • DOI: 10.1109/cdc.1990.203841