Do we agree? Interrater reliability in education [Zijn we het eens? Interbeoordelaarsbetrouwbaarheid in de pedagogiek en het onderwijs]

L. A. van der Ark, D. ten Hove

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Ratings and assessments are daily practice in education: teachers decide whether a student's behaviour should be punished, teachers grade students' theses, and child-protection officers make complex assessments of juvenile delinquents. Interrater reliability (IRR) expresses the level of agreement among raters. We discuss three questions. The first question, how to determine the IRR, cannot be answered unequivocally: many coefficients are available to estimate the IRR, but they often produce different results, and the choice among them depends on the research goal, the research design, and personal preference. The answer to the second question, how to increase the IRR, entails increasing the expertise of the raters and the quality of the items, rubrics, and procedures. To answer the third question, when is the IRR high enough, we present a method to transform benchmarks for one IRR coefficient into benchmarks for another, although further methodological research is required to answer this question more fully.
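
The abstract notes that different IRR coefficients often produce different results for the same ratings. As a minimal illustration (not taken from the paper, and using hypothetical data), the Python sketch below computes two common coefficients, raw percent agreement and Cohen's kappa, on the same pair of ratings; the values differ because kappa corrects for agreement expected by chance.

    # Minimal sketch: two IRR coefficients on the same hypothetical data.
    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    # Hypothetical grades (1-5) that two raters assigned to 10 theses
    rater_a = np.array([4, 3, 5, 2, 4, 4, 3, 5, 2, 4])
    rater_b = np.array([4, 3, 4, 2, 4, 5, 3, 5, 3, 4])

    # Proportion of cases on which the raters agree exactly
    percent_agreement = np.mean(rater_a == rater_b)

    # Cohen's kappa: agreement corrected for chance agreement
    kappa = cohen_kappa_score(rater_a, rater_b)

    print(f"Percent agreement: {percent_agreement:.2f}")  # 0.70
    print(f"Cohen's kappa:     {kappa:.2f}")

Because the two coefficients answer subtly different questions, neither value is "the" IRR; this is the sense in which the choice of coefficient depends on the research goal and design.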
Original language: English
Pages (from-to): 361-371
Journal: Pedagogische Studiën
Volume: 95
Issue number: 5-6
Publication status: Published - 2019
Externally published: Yes
