Inter-rater agreement and reliability of the COSMIN (Consensus-based Standards for the selection of health status Measurement Instruments) Checklist

L.B. Mokkink, C.B. Terwee, E. Gibbons, P.W. Stratford, J. Alonso, D.L. Patrick, D.L. Knol, L.M. Bouter, H.C.W. de Vet

    Research output: Contribution to Journal › Article › Academic › peer-review

    Abstract

    Background. The COSMIN checklist is a tool for evaluating the methodological quality of studies on measurement properties of health-related patient-reported outcomes. The aim of this study was to determine the inter-rater agreement and reliability of each item score of the COSMIN checklist (n = 114). Methods. 75 articles evaluating measurement properties were randomly selected from the bibliographic database compiled by the Patient-Reported Outcome Measurement Group, Oxford, UK. Raters were asked to assess the methodological quality of three articles, using the COSMIN checklist. In a one-way design, percentage agreement and intraclass kappa coefficients or quadratic-weighted kappa coefficients were calculated for each item. Results. 88 raters participated. Of the 75 selected articles, 26 were rated by four to six participants, and 49 by two or three participants. Overall, percentage agreement was appropriate (68% of the items reached above 80% agreement), while the kappa coefficients for the COSMIN items were low (61% were below 0.40, 6% were above 0.75). Reasons for low inter-rater agreement were the need for subjective judgement, and raters being accustomed to different standards, terminology and definitions. Conclusions. The results indicated that raters often chose the same response option, but that it is difficult at the item level to distinguish between articles. When using the COSMIN checklist in a systematic review, we recommend obtaining some training and experience, having the checklist completed by two independent raters, and reaching consensus on one final rating. The instructions for using the checklist have been improved. © 2010 Mokkink et al; licensee BioMed Central Ltd.
    Original language: English
    Pages (from-to): 82
    Number of pages: 11
    Journal: BMC Medical Research Methodology
    Volume: 10
    DOIs: 10.1186/1471-2288-10-82
    Publication status: Published - 2010
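    The Methods combine per-item percentage agreement with quadratic-weighted kappa coefficients. As an illustrative sketch only — not the authors' analysis code, which used intraclass kappas in a one-way design with a variable number of raters per article — the two-rater versions of these statistics can be computed as follows (function names are hypothetical):

    ```python
    import numpy as np

    def percentage_agreement(r1, r2):
        """Fraction of items on which two raters chose the same category."""
        r1, r2 = np.asarray(r1), np.asarray(r2)
        return float(np.mean(r1 == r2))

    def quadratic_weighted_kappa(r1, r2, n_categories):
        """Cohen's kappa with quadratic disagreement weights.

        Categories are assumed to be coded 0 .. n_categories - 1.
        """
        r1, r2 = np.asarray(r1), np.asarray(r2)
        # observed confusion matrix, normalized to proportions
        O = np.zeros((n_categories, n_categories))
        for a, b in zip(r1, r2):
            O[a, b] += 1
        O /= O.sum()
        # expected matrix under chance (outer product of the marginals)
        E = np.outer(O.sum(axis=1), O.sum(axis=0))
        # quadratic weights: penalty grows with squared category distance
        idx = np.arange(n_categories)
        W = (idx[:, None] - idx[None, :]) ** 2 / (n_categories - 1) ** 2
        return 1.0 - (W * O).sum() / (W * E).sum()
    ```

    With this weighting, near-misses between adjacent response options are penalized far less than large disagreements, which is why weighted kappa is the usual choice for ordinal checklist scores.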

    Cite this

    Mokkink, L.B. ; Terwee, C.B. ; Gibbons, E. ; Stratford, P.W. ; Alonso, J. ; Patrick, D.L. ; Knol, D.L. ; Bouter, L.M. ; de Vet, H.C.W. / Inter-rater agreement and reliability of the COSMIN (Consensus-based Standards for the selection of health status Measurement Instruments) Checklist. In: BMC Medical Research Methodology. 2010 ; Vol. 10. pp. 82.
    @article{955d6df024674c88849fce4a6583cab8,
    title = "Inter-rater agreement and reliability of the COSMIN (Consensus-based Standards for the selection of health status Measurement Instruments) Checklist",
    author = "L.B. Mokkink and C.B. Terwee and E. Gibbons and P.W. Stratford and J. Alonso and D.L. Patrick and D.L. Knol and L.M. Bouter and {de Vet}, H.C.W.",
    year = "2010",
    doi = "10.1186/1471-2288-10-82",
    language = "English",
    volume = "10",
    pages = "82",
    journal = "BMC Medical Research Methodology",
    issn = "1471-2288",
    publisher = "BioMed Central",

    }
