On the reliability of a dental OSCE, using SEM: effect of different days

M.E. Schoonheim-Klein, A. Muijtens, L.L.M.H. Habets, M. Manogue, C. van der Vleuten, J. Hoogstraten, U. van der Velden

    Research output: Contribution to Journal › Article › Academic


    Aim: The first aim was to study the reliability of a dental objective structured clinical examination (OSCE) administered over multiple days. The second was to determine how many test stations such an OSCE requires for sufficiently reliable decisions from each of three score-interpretation perspectives.
    Materials and methods: In four OSCE administrations, 463 students from the years 2005 and 2006 took the summative OSCE after a course in comprehensive dentistry. The OSCE had 16-18 five-minute stations (scored 1-10), and each OSCE was administered on four different days within 1 week. ANOVA was used to test for variation in examinee performance across days. Generalizability theory was used for the reliability analyses. Reliability was studied from three interpretation perspectives: relative (norm) decisions, absolute (domain) decisions, and pass-fail (mastery) decisions.
    The standard error of measurement (SEM) was used as the indicator of the reproducibility of test scores in this dental OSCE. The SEM benchmark was set at <0.51, which corresponds to a 95% confidence interval (CI) of less than ±1 on the original scoring scale, which ranged from 1 to 10.
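The link between the SEM benchmark and the CI width can be checked with a short calculation. This is a minimal sketch, assuming normally distributed measurement error so that the 95% CI half-width is 1.96 times the SEM; the function name `ci_half_width` is illustrative and does not come from the article.

```python
# Half-width of a 95% confidence interval around an observed score,
# ASSUMING normally distributed measurement error (z = 1.96).
Z_95 = 1.96


def ci_half_width(sem: float, z: float = Z_95) -> float:
    """Return the +/- half-width of the confidence interval for a given SEM."""
    return z * sem


# The benchmark SEM of 0.51 keeps the interval just inside +/-1 scale point.
benchmark_half_width = ci_half_width(0.51)
print(f"SEM 0.51 -> CI half-width {benchmark_half_width:.4f}")
```

Under this assumption, any SEM below 0.51 gives a 95% CI narrower than ±1 point on the 1-10 scale.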
    Results: The mean weighted total OSCE score was 7.14 on a 10-point scale. With the pass-fail score set at 6.2 for the four OSCEs, 90% of the 463 students passed.
    There was no significant increase in scores over the different days on which the OSCE was administered. The desired ('wished') variance attributable to students was 6.3%. Variance attributable to the student-by-station interaction plus residual error was 66.3%, more than twice the variance attributable to station difficulty (27.4%). The SEM for norm-referenced decisions was 0.42 (CI ±0.83), and the SEM for domain-referenced decisions was 0.50 (CI ±0.98). To make reliable relative decisions (SEM <0.51), a minimum of 12 stations is necessary; for reliable absolute and pass-fail decisions, a minimum of 17 stations is necessary in this dental OSCE.
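The station counts above can be illustrated with a decision-study style projection. This is a rough sketch rather than the authors' analysis. It assumes a fully crossed student-by-station design, in which the relative error variance is the interaction/residual variance divided by the number of stations and the absolute error variance also includes the station variance. Because the abstract reports only variance shares, the absolute variances are back-calculated from the reported norm SEM of 0.42, assuming a reference test length of 17 stations (the abstract says only 16-18). All names in the code are hypothetical.

```python
import math

# Variance shares reported in the abstract (percent of total variance).
STATION_SHARE = 27.4    # station difficulty
RESIDUAL_SHARE = 66.3   # student-by-station interaction plus residual error

SEM_BENCHMARK = 0.51
REPORTED_SEM_NORM = 0.42
REFERENCE_STATIONS = 17  # ASSUMPTION: the abstract states 16-18 stations

# Back-calculate single-station variance components from the reported SEM.
residual_var = REPORTED_SEM_NORM ** 2 * REFERENCE_STATIONS
station_var = residual_var * STATION_SHARE / RESIDUAL_SHARE


def sem(n_stations: int, absolute: bool) -> float:
    """Projected SEM for a test of n_stations stations."""
    error_var = residual_var + (station_var if absolute else 0.0)
    return math.sqrt(error_var / n_stations)


def min_stations(absolute: bool, benchmark: float = SEM_BENCHMARK,
                 max_stations: int = 100) -> int:
    """Smallest number of stations whose projected SEM is below the benchmark."""
    for n in range(1, max_stations + 1):
        if sem(n, absolute) < benchmark:
            return n
    raise ValueError("benchmark not reached within max_stations")


print("relative decisions:", min_stations(absolute=False))
print("absolute decisions:", min_stations(absolute=True))
print("projected domain SEM at 17 stations:", round(sem(17, absolute=True), 2))
```

Under these assumptions the sketch yields 12 stations for relative decisions and 17 for absolute decisions, with a projected domain SEM of about 0.50, which agrees with the reported figures. A different reference test length would shift the results.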
    Conclusions: When testing large numbers of students, administering the OSCE on different days appeared to be reliable. To make reliable decisions with this dental OSCE, a minimum of 17 stations is needed. Clearly, wide sampling of stations is at the heart of obtaining reliable OSCE scores, in dental education as elsewhere.
    Original language: Undefined/Unknown
    Pages (from-to): 131-137
    Number of pages: 7
    Journal: European journal of dental education
    Issue number: 3
    Publication status: Published - 2008
