Bootstrapping LPs in Value Iteration for Multi-Objective and Partially Observable MDPs

Diederik M. Roijers, Erwin Walraven, Matthijs T.J. Spaan

Research output: Chapter in Book / Report / Conference proceeding > Conference contribution > Academic > peer-review

Abstract

Iteratively solving a set of linear programs (LPs) is a common strategy for solving various decision-making problems in Artificial Intelligence, such as planning in multi-objective or partially observable Markov Decision Processes (MDPs). A prevalent feature is that the solutions to these LPs become increasingly similar as the solving algorithm converges, because the solution computed by the algorithm approaches the fixed point of a Bellman backup operator. In this paper, we propose to speed up the solving process of these LPs by bootstrapping based on similar LPs solved previously. We use these LPs to initialize a subset of relevant LP constraints, before iteratively generating the remaining constraints. The resulting algorithm is the first to consider such information sharing across iterations. We evaluate our approach on planning in Multi-Objective MDPs (MOMDPs) and Partially Observable MDPs (POMDPs), showing that it solves fewer LPs than the state of the art, which leads to a significant speed-up. Moreover, for MOMDPs we show that our method scales better in both the number of states and the number of objectives, which is vital for multi-objective planning.
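The observation that successive solutions grow more similar follows from the Bellman backup being a contraction: each backup shrinks the max-norm distance between consecutive value functions by at least the discount factor. The sketch below illustrates only that property, not the paper's LP-bootstrapping algorithm. The two-state MDP, its transition and reward numbers, and the names `GAMMA`, `P`, `R`, `bellman_backup`, and `successive_gaps` are all hypothetical, chosen for illustration.

```python
# Hypothetical 2-state, 2-action MDP; every number here is made up for illustration.
GAMMA = 0.9  # discount factor (assumed)

# P[a][s][s2]: probability of moving from state s to s2 under action a.
P = [
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.1, 0.9], [0.6, 0.4]],
]
# R[a][s]: immediate reward for taking action a in state s.
R = [
    [1.0, 0.0],
    [0.5, 2.0],
]


def bellman_backup(v):
    """Apply one Bellman optimality backup to the value vector v."""
    n_states = len(v)
    return [
        max(
            R[a][s] + GAMMA * sum(P[a][s][s2] * v[s2] for s2 in range(n_states))
            for a in range(len(P))
        )
        for s in range(n_states)
    ]


def successive_gaps(n_iters):
    """Return the max-norm distance between consecutive value iterates."""
    v = [0.0, 0.0]
    gaps = []
    for _ in range(n_iters):
        v_next = bellman_backup(v)
        gaps.append(max(abs(x - y) for x, y in zip(v_next, v)))
        v = v_next
    return gaps


gaps = successive_gaps(30)
for k in (0, 9, 19, 29):
    print(f"iteration {k + 1:2d}: gap = {gaps[k]:.6f}")
```

Because each gap is at most `GAMMA` times the previous one, later iterations work with value functions that barely differ from their predecessors. This is the regime in which reusing information from previously solved LPs, as the abstract proposes, can plausibly pay off.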

Original language: English
Title of host publication: Proceedings of the Twenty-Eighth International Conference on Automated Planning and Scheduling
Editors: Mathijs de Weerdt, Sven Koenig, Gabriele Röger, Matthijs Spaan
Place of Publication: Palo Alto, CA
Publisher: The AAAI Press
Pages: 218-226
Number of pages: 9
ISBN (Print): 9781577357971
Publication status: Published - 2018
Event: 28th International Conference on Automated Planning and Scheduling, ICAPS 2018 - Delft, Netherlands
Duration: 24 Jun 2018 → 29 Jun 2018

Publication series

Name: AAAI Press Proceedings
Publisher: The AAAI Press
ISSN (Electronic): 2334-0843

Conference

Conference: 28th International Conference on Automated Planning and Scheduling, ICAPS 2018
Country/Territory: Netherlands
City: Delft
Period: 24/06/18 → 29/06/18

Funding

Diederik M. Roijers is a postdoctoral fellow of the Research Foundation – Flanders (FWO). The research by Erwin Walraven is funded by the Netherlands Organisation for Scientific Research (NWO), as part of the Uncertainty Reduction in Smart Energy Systems (URSES) program.

Funders: Nederlandse Organisatie voor Wetenschappelijk Onderzoek
