Reinforcement learning as heuristic for action-rule preferences

Joost Broekens*, Koen Hindriks, Pascal Wiggers

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding, Conference contribution, Academic, peer-review


A common action selection mechanism in agent-oriented programming is to base action selection on a set of rules. Since rules need not be mutually exclusive, agents are often underspecified: their decision-making leaves room for multiple choices of action. Underspecification implies there is potential for improving or optimizing the agent's behavior. Such optimization, however, is not always naturally coded using BDI-like agent concepts. In this paper, we propose an approach to exploit this potential for improvement using reinforcement learning. The approach is based on learning rule priorities to solve the rule-selection problem, and we show that it significantly improves the behavior of an agent. Key here is the use of a state representation that combines the set of rules of the agent with a domain-independent heuristic based on the number of active goals. Our experiments show that this provides a useful generic base for learning while avoiding the state-explosion problem or overfitting.
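The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes tabular Q-learning, a toy task with two hypothetical rules (`work` achieves one goal, `idle` does nothing), and a reward of 1 per achieved goal, none of which are specified above. The names `make_state`, `RulePreferenceLearner`, and `run_episode` are invented for illustration. The state pairs the set of currently applicable rules with the number of active goals, and the learned values serve as rule priorities.

```python
import random
from collections import defaultdict


def make_state(applicable_rules, active_goals):
    """State = set of applicable rules + number of active goals (no domain detail)."""
    return (frozenset(applicable_rules), active_goals)


class RulePreferenceLearner:
    """Tabular Q-learning over (state, rule) pairs; an assumed stand-in learner."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # learned value of firing a rule in a state
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def choose(self, state):
        rules = sorted(state[0])  # sorted so a fixed seed gives reproducible runs
        if self.rng.random() < self.epsilon:
            return self.rng.choice(rules)  # explore
        best = max(self.q[(state, r)] for r in rules)
        # Break ties at random so no rule is favoured before any learning.
        return self.rng.choice([r for r in rules if self.q[(state, r)] == best])

    def update(self, state, rule, reward, next_state):
        best_next = max((self.q[(next_state, r)] for r in next_state[0]), default=0.0)
        target = reward + self.gamma * best_next
        self.q[(state, rule)] += self.alpha * (target - self.q[(state, rule)])

    def priorities(self, state):
        """Applicable rules ordered from most to least preferred."""
        return sorted(state[0], key=lambda r: self.q[(state, r)], reverse=True)


def run_episode(learner, goals=3, max_steps=50):
    """Toy task (assumption): 'work' achieves one goal, 'idle' changes nothing."""
    rules = {"work", "idle"}
    steps = 0
    while goals > 0 and steps < max_steps:
        state = make_state(rules, goals)
        rule = learner.choose(state)
        reward = 0.0
        if rule == "work":
            goals -= 1
            reward = 1.0  # assumed reward: one unit per achieved goal
        # No rule is applicable once every goal has been achieved.
        next_state = make_state(rules if goals > 0 else (), goals)
        learner.update(state, rule, reward, next_state)
        steps += 1
    return steps


if __name__ == "__main__":
    learner = RulePreferenceLearner(seed=0)
    for _ in range(200):
        run_episode(learner)
    print(learner.priorities(make_state({"work", "idle"}, 2)))
```

Because the state records only which rules apply and how many goals remain, the table stays small regardless of how rich the underlying domain is, which is the property the abstract attributes to this representation.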

Original language: English
Title of host publication: Programming Multi-Agent Systems - 8th International Workshop, ProMAS 2010, Revised Selected Papers
Number of pages: 16
Publication status: Published - 9 Apr 2012
Externally published: Yes
Event: 8th International Workshop on Programming Multi-Agent Systems, ProMAS 2010 - Toronto, ON, Canada
Duration: 11 May 2010 to 11 May 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6599 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 8th International Workshop on Programming Multi-Agent Systems, ProMAS 2010
City: Toronto, ON


  • Agent-oriented programming
  • reinforcement learning
  • rule preferences

