Abstract
In this article, we develop a novel role for the initial function v0 in the value iteration algorithm. When the optimal policy of a countable-state Markovian queueing control problem has a threshold or switching-curve structure, we conjecture that one can tune the choice of v0 to generate monotonic sequences of n-stage threshold or switching-curve optimal policies. We show this for three queueing control models: the M/M/1 queue with admission control, the M/M/1 queue with service control, and the two-competing-queues model with quadratic holding cost. As a consequence, we obtain increasingly tight upper and lower bounds. After a finite number of iterations, either the optimal threshold or the optimal switching-curve values in a finite number of states are available. This procedure can be used to increase numerical efficiency.
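To make the role of v0 concrete, the following is a minimal sketch of value iteration for a discounted M/M/1 admission control problem, recording the n-stage rejection threshold at each iteration. It is not the article's formulation: the cost structure (linear holding cost `h`, rejection cost `c`), the discount factor `beta`, the truncation level `N`, and the function name are all assumptions made for illustration, and the threshold is read off as the smallest rejecting state, which presumes the n-stage policy has threshold form.

```python
def value_iteration_thresholds(lam, mu, h, c, beta, N, v0, n_iter):
    """Run n_iter value iteration steps from the initial function v0.

    Hypothetical model (an assumption, not taken from the article):
    states 0..N, arrivals at rate lam, services at rate mu, holding
    cost h*x and rejection cost c per uniformized step, discount beta.
    Returns the list of n-stage thresholds and the final value function.
    """
    # Uniformize so that the two event probabilities sum to one.
    total = lam + mu
    lam, mu = lam / total, mu / total

    v = list(v0)
    thresholds = []
    for _ in range(n_iter):
        # Smallest state where rejecting is strictly cheaper than admitting;
        # N means "admit in every state below the truncation level".
        t = N
        for x in range(N):
            if c + v[x] < v[x + 1]:
                t = x
                break
        thresholds.append(t)

        new_v = []
        for x in range(N + 1):
            # At the truncation level N an arrival is always rejected.
            arrival = c + v[x] if x == N else min(v[x + 1], c + v[x])
            departure = v[max(x - 1, 0)]
            new_v.append(h * x + beta * (lam * arrival + mu * departure))
        v = new_v
    return thresholds, v


# Start from v0 = 0 and inspect how the n-stage thresholds evolve.
thresholds, v = value_iteration_thresholds(
    lam=0.6, mu=0.4, h=1.0, c=5.0, beta=0.95, N=50, v0=[0.0] * 51, n_iter=200
)
print(thresholds[:10], thresholds[-1])
```

Passing a different `v0` (for example, a convex function of the state) changes the sequence of recorded thresholds, which is the degree of freedom the abstract refers to.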
Original language | English |
---|---|
Pages (from-to) | 638-659 |
Number of pages | 22 |
Journal | Naval Research Logistics |
Volume | 65 |
Issue number | 8 |
Early online date | 28 Dec 2018 |
Publication status | Published - Dec 2018 |
Bibliographical note
Special Issue: Pete Veinott (Volume 1)
Keywords
- deriving bounds
- optimal policies
- value iteration