Pitfalls in quantifying exploration in reward-based motor learning and how to avoid them

Nina M. van Mastrigt*, Katinka van der Kooij, Jeroen B.J. Smeets

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

When learning a movement based on binary success information, one is more variable following failure than following success. Theoretically, the additional variability post-failure might reflect exploration of possibilities to obtain success. When average behavior is changing (as in learning), variability can be estimated from differences between subsequent movements. Can one estimate exploration reliably from such trial-to-trial changes when studying reward-based motor learning? To answer this question, we tried to reconstruct the exploration underlying learning as described by four existing reward-based motor learning models. We simulated learning for various learner and task characteristics. If we simply determined the additional change post-failure, estimates of exploration were sensitive to learner and task characteristics. We identified two pitfalls in quantifying exploration based on trial-to-trial changes. Firstly, performance-dependent feedback can cause correlated samples of motor noise and exploration on successful trials, which biases exploration estimates. Secondly, the trial relative to which trial-to-trial change is calculated may also contain exploration, which causes underestimation. As a solution, we developed the additional trial-to-trial change (ATTC) method. By moving the reference trial one trial back and subtracting trial-to-trial changes following specific sequences of trial outcomes, exploration can be estimated reliably for the three models that explore based on the outcome of only the previous trial. Since ATTC estimates are based on a selection of trial sequences, this method requires many trials. In conclusion, if exploration is a binary function of previous trial outcome, the ATTC method allows for a model-free quantification of exploration.
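The "additional change post-failure" estimate mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' code and not one of the four models they studied: the toy learner keeps a fixed aim of 0, adds Gaussian exploration only after a failed trial, and variability is measured as variance. The names `simulate`, `naive_additional_change`, `sigma_motor`, `sigma_explore` and `target_halfwidth` are hypothetical. The sketch does not implement the ATTC method, whose exact trial sequences are not given in the abstract.

```python
import random
import statistics


def simulate(n_trials=20000, sigma_motor=1.0, sigma_explore=2.0,
             target_halfwidth=1.5, seed=0):
    """Toy learner (assumption): aim fixed at 0, exploration only after failure."""
    rng = random.Random(seed)
    movements, successes = [], []
    prev_success = True
    for _ in range(n_trials):
        # Exploration is a binary function of the previous trial's outcome.
        explore = 0.0 if prev_success else rng.gauss(0.0, sigma_explore)
        x = explore + rng.gauss(0.0, sigma_motor)
        # Binary, performance-dependent feedback.
        success = abs(x) <= target_halfwidth
        movements.append(x)
        successes.append(success)
        prev_success = success
    return movements, successes


def naive_additional_change(movements, successes):
    """Variance of trial-to-trial change after failure minus after success."""
    post_fail, post_succ = [], []
    for t in range(len(movements) - 1):
        change = movements[t + 1] - movements[t]
        (post_succ if successes[t] else post_fail).append(change)
    return statistics.pvariance(post_fail) - statistics.pvariance(post_succ)


if __name__ == "__main__":
    true_exploration_variance = 2.0 ** 2
    for halfwidth in (0.5, 1.5, 3.0):
        moves, outcomes = simulate(target_halfwidth=halfwidth)
        estimate = naive_additional_change(moves, outcomes)
        print(f"target half-width {halfwidth}: naive estimate {estimate:.2f}, "
              f"true exploration variance {true_exploration_variance:.2f}")
```

In this toy setting the naive estimate changes with the target size even though the exploration variance is held constant, because the reference trial is itself selected by its outcome. This is only one simple way in which task characteristics can leak into the estimate; the direction and size of the bias in the models the authors studied may differ.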

Original language: English
Pages (from-to): 365-382
Number of pages: 18
Journal: Biological Cybernetics
Volume: 115
Issue number: 4
Early online date: 2 Aug 2021
Publication status: Published - Aug 2021

Bibliographical note

Funding Information:
The research was funded by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek, Toegepaste en Technische Wetenschappen (NWO-TTW), by the Open Technologie Programma (OTP) grant 15989 awarded to Jeroen Smeets. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Publisher Copyright:
© 2021, The Author(s).

Keywords

  • Exploration
  • Motor learning
  • Reinforcement
  • Reward
  • Trial-by-trial analysis
  • Variability
