TY - JOUR
T1 - The sign of exploration during reward-based motor learning is not independent from trial to trial
AU - Kooij, Katinka van der
AU - Smeets, Jeroen B.J.
AU - Mastrigt, Nina M. van
AU - Wijk, Bernadette C.M. van
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/5
Y1 - 2025/5
AB - Humans can learn various motor tasks based on binary reward feedback on whether a movement attempt was successful or not. Such ‘reward-based motor learning’ relies on exploiting successful motor commands and exploring different motor commands following failure. Most computational models of reward-based motor learning have formalized exploration as a random process, in which on each trial a random draw is taken from a normal distribution centred on zero. Whether human motor exploration is indeed random from trial to trial has not yet been tested. Here we tested in a force production task whether human motor exploration is random. To this end, we compared the proportion of trial-to-trial force changes in the behavioural data that have the same sign to the proportion expected under random exploration. One group of participants practiced with an adaptive reward criterion, which keeps rewarded performance close to current performance, and the other group practiced with a fixed reward criterion, in which current performance can be far from rewarded performance. In both groups, we found a proportion of same-sign changes larger than predicted. In the Adaptive group, both the learning and the proportion of same-sign changes were consistent with model simulations for low values of random exploration, whereas in the Fixed group both the learning and the proportion of same-sign changes were inconsistent with model simulations based on random exploration. This suggests that some form of non-random motor exploration contributes to reward-based motor learning.
KW - Exploration
KW - Motor learning
KW - Reinforcement learning
KW - Reward
UR - https://www.scopus.com/pages/publications/105002765741
UR - https://www.scopus.com/inward/citedby.url?scp=105002765741&partnerID=8YFLogxK
U2 - 10.1007/s00221-025-07074-z
DO - 10.1007/s00221-025-07074-z
M3 - Article
AN - SCOPUS:105002765741
SN - 0014-4819
VL - 243
JO - Experimental Brain Research
JF - Experimental Brain Research
IS - 5
M1 - 117
ER -