Optimal mixing of Markov decision rules for MDP control

Research output: Contribution to journal › Article

Abstract

In this article we study Markov decision process (MDP) problems with the restriction that at decision epochs, only a finite number of given Markov decision rules are admissible. For example, the set of admissible Markov decision rules D could consist of some easy-implementable decision rules. Additionally, many open-loop control problems can be modeled as an MDP with such a restriction on the admissible decision rules. Within the class of available policies, optimal policies are generally nonstationary and it is difficult to prove that some policy is optimal. We give an example with two admissible decision rules - D={d
Language: English
Pages: 307-342
Journal: Probability in the Engineering and Informational Sciences
Volume: 25
Issue number: 3
DOI: 10.1017/S0269964811000039
State: Published - 2011

Cite this

@article{5f66ded2af8840fbb843fc616a524cd2,
title = "Optimal mixing of Markov decision rules for MDP control",
abstract = "In this article we study Markov decision process (MDP) problems with the restriction that at decision epochs, only a finite number of given Markov decision rules are admissible. For example, the set of admissible Markov decision rules D could consist of some easy-implementable decision rules. Additionally, many open-loop control problems can be modeled as an MDP with such a restriction on the admissible decision rules. Within the class of available policies, optimal policies are generally nonstationary and it is difficult to prove that some policy is optimal. We give an example with two admissible decision rules - D={d",
author = "{van der Laan}, D.A.",
year = "2011",
doi = "10.1017/S0269964811000039",
language = "English",
volume = "25",
pages = "307--342",
journal = "Probability in the Engineering and Informational Sciences",
issn = "0269-9648",
publisher = "Cambridge University Press",
number = "3",

}

Optimal mixing of Markov decision rules for MDP control. / van der Laan, D.A.

In: Probability in the Engineering and Informational Sciences, Vol. 25, No. 3, 2011, p. 307-342.

TY - JOUR

T1 - Optimal mixing of Markov decision rules for MDP control

AU - van der Laan, D.A.

PY - 2011

Y1 - 2011

N2 - In this article we study Markov decision process (MDP) problems with the restriction that at decision epochs, only a finite number of given Markov decision rules are admissible. For example, the set of admissible Markov decision rules D could consist of some easy-implementable decision rules. Additionally, many open-loop control problems can be modeled as an MDP with such a restriction on the admissible decision rules. Within the class of available policies, optimal policies are generally nonstationary and it is difficult to prove that some policy is optimal. We give an example with two admissible decision rules - D={d

AB - In this article we study Markov decision process (MDP) problems with the restriction that at decision epochs, only a finite number of given Markov decision rules are admissible. For example, the set of admissible Markov decision rules D could consist of some easy-implementable decision rules. Additionally, many open-loop control problems can be modeled as an MDP with such a restriction on the admissible decision rules. Within the class of available policies, optimal policies are generally nonstationary and it is difficult to prove that some policy is optimal. We give an example with two admissible decision rules - D={d

U2 - 10.1017/S0269964811000039

DO - 10.1017/S0269964811000039

M3 - Article

VL - 25

SP - 307

EP - 342

JO - Probability in the Engineering and Informational Sciences

T2 - Probability in the Engineering and Informational Sciences

JF - Probability in the Engineering and Informational Sciences

SN - 0269-9648

IS - 3

ER -