TY - GEN
T1 - Understanding the Behavior of Reinforcement Learning Agents
AU - Stork, Jörg
AU - Zaefferer, Martin
AU - Bartz-Beielstein, Thomas
AU - Eiben, A. E.
PY - 2020
Y1 - 2020
N2 - Reinforcement Learning (RL) is the process of training agents to solve specific tasks, based on measures of reward. Understanding the behavior of an agent in its environment can be crucial. For instance, if users understand why specific agents fail at a task, they might be able to define better reward functions, to steer the agents’ development in the right direction. Understandability also empowers decisions for agent deployment. If we know why the controller of an autonomous car fails or excels in specific traffic situations, we can make better decisions on whether/when to use it in practice. We aim to facilitate the understandability of RL. To that end, we investigate and observe the behavioral space: the set of actions of an agent observed for a set of input states. Subsequently, we develop measures of distance or similarity in that space and analyze how agents compare in their behavior. Moreover, we investigate which states and actions are critical for a task, and determine the correlation between reward and behavior. We utilize two basic RL environments to investigate our measures. The results showcase the high potential of inspecting agents’ behavior and comparing their distance in behavior space.
AB - Reinforcement Learning (RL) is the process of training agents to solve specific tasks, based on measures of reward. Understanding the behavior of an agent in its environment can be crucial. For instance, if users understand why specific agents fail at a task, they might be able to define better reward functions, to steer the agents’ development in the right direction. Understandability also empowers decisions for agent deployment. If we know why the controller of an autonomous car fails or excels in specific traffic situations, we can make better decisions on whether/when to use it in practice. We aim to facilitate the understandability of RL. To that end, we investigate and observe the behavioral space: the set of actions of an agent observed for a set of input states. Subsequently, we develop measures of distance or similarity in that space and analyze how agents compare in their behavior. Moreover, we investigate which states and actions are critical for a task, and determine the correlation between reward and behavior. We utilize two basic RL environments to investigate our measures. The results showcase the high potential of inspecting agents’ behavior and comparing their distance in behavior space.
KW - Behavior
KW - Reinforcement Learning
KW - Understandable AI
UR - http://www.scopus.com/inward/record.url?scp=85097232301&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097232301&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-63710-1_12
DO - 10.1007/978-3-030-63710-1_12
M3 - Conference contribution
AN - SCOPUS:85097232301
SN - 9783030637095
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 148
EP - 160
BT - Bioinspired Optimization Methods and Their Applications
A2 - Filipič, Bogdan
A2 - Minisci, Edmondo
A2 - Vasile, Massimiliano
PB - Springer Science and Business Media Deutschland GmbH
T2 - 9th International Conference on Bioinspired Optimization Methods and Their Applications, BIOMA 2020
Y2 - 19 November 2020 through 20 November 2020
ER -