TY - GEN
T1 - Learning to play Donkey Kong using neural networks and reinforcement learning
AU - Ozkohen, Paul
AU - Visser, Jelle
AU - van Otterlo, Martijn
AU - Wiering, Marco
PY - 2018/2
Y1 - 2018/2
N2 - Neural networks and reinforcement learning have successfully been applied to various games, such as Ms. Pacman and Go. We combine multilayer perceptrons and a class of reinforcement learning algorithms known as actor-critic to learn to play the arcade classic Donkey Kong. Two neural networks are used in this study: the actor and the critic. The actor learns to select the best action given the game state; the critic tries to learn the value of being in a certain state. First, a base game-playing performance is obtained by learning from demonstration, where data is obtained from human players. After this off-line training phase, we further improve the base performance using feedback from the critic. The critic gives feedback by comparing the value of the state before and after taking the action. Results show that an agent pre-trained on demonstration data is able to achieve a good baseline performance. Applying actor-critic methods, however, usually does not improve performance, and in many cases even decreases it. Possible reasons include the game not being fully Markovian, among other issues.
AB - Neural networks and reinforcement learning have successfully been applied to various games, such as Ms. Pacman and Go. We combine multilayer perceptrons and a class of reinforcement learning algorithms known as actor-critic to learn to play the arcade classic Donkey Kong. Two neural networks are used in this study: the actor and the critic. The actor learns to select the best action given the game state; the critic tries to learn the value of being in a certain state. First, a base game-playing performance is obtained by learning from demonstration, where data is obtained from human players. After this off-line training phase, we further improve the base performance using feedback from the critic. The critic gives feedback by comparing the value of the state before and after taking the action. Results show that an agent pre-trained on demonstration data is able to achieve a good baseline performance. Applying actor-critic methods, however, usually does not improve performance, and in many cases even decreases it. Possible reasons include the game not being fully Markovian, among other issues.
KW - Actor-critic
KW - Donkey Kong
KW - Games
KW - Machine learning
KW - Neural networks
KW - Platformer
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85043256717&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85043256717&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-76892-2_11
DO - 10.1007/978-3-319-76892-2_11
M3 - Conference contribution
AN - SCOPUS:85043256717
SN - 9783319768915
T3 - Communications in Computer and Information Science
SP - 145
EP - 160
BT - Artificial Intelligence - 29th Benelux Conference, BNAIC 2017, Revised Selected Papers
PB - Springer Verlag
T2 - 29th Benelux Conference on Artificial Intelligence, BNAIC 2017
Y2 - 8 November 2017 through 9 November 2017
ER -