Reweighting the UK Biobank to reflect its underlying sampling population substantially reduces pervasive selection bias due to volunteering

Research output: Working paper / Preprint › Preprint › Professional

Abstract

The UK Biobank (UKB) is a large cohort study of considerable empirical importance to fields such as medicine, epidemiology, statistical genetics, and the social sciences, owing to its very large size (~500,000 individuals) and the wide range of variables it makes available. However, the UKB is not representative of its underlying sampling population. Selection bias due to volunteering (volunteer bias) is a known source of confounding. Individuals entering the UKB are more likely to be older, to be female, and to be of higher socioeconomic status. Using representative microdata from the UK Census as a reference, we document significant bias in estimated associations due to non-random selection into the UKB. For some associations, volunteer bias in the UKB is so severe that estimates have the opposite sign: for example, older individuals in the UKB tend to be in better health. To aid researchers in correcting for volunteer bias in the UKB, we construct inverse probability weights based on UK Census microdata. Using these weights in weighted regressions reduces volunteer bias by 78% on average. Our inverse probability weights will be made available.
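
As a rough illustration of the reweighting idea described in the abstract, the sketch below computes cell-based weights (population share divided by sample share) and applies them to a mean and to a simple weighted regression slope. It is not the authors' procedure: the function names, the two-cell "young"/"old" stratification, and the toy counts are all assumptions made for illustration, and the preprint's actual weights may well be built from a richer selection model.

```python
from collections import Counter


def cell_weights(population_cells, sample_cells):
    """Hypothetical helper: weight per cell = population share / sample share.

    Cells over-represented in the sample get weights below 1,
    under-represented cells get weights above 1.
    """
    pop = Counter(population_cells)
    samp = Counter(sample_cells)
    n_pop = sum(pop.values())
    n_samp = sum(samp.values())
    return {c: (pop[c] / n_pop) / (samp[c] / n_samp) for c in samp}


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_slope(x, y, w):
    """Slope of a weighted least-squares fit of y on x (with intercept)."""
    mx = weighted_mean(x, w)
    my = weighted_mean(y, w)
    cov = sum(wi * (xi - mx) * (yi - my) for xi, yi, wi in zip(x, y, w))
    var = sum(wi * (xi - mx) ** 2 for xi, wi in zip(x, w))
    return cov / var


# Toy data (assumed): the reference population is half young, half old,
# but the volunteer sample is 80% old.
population = ["young"] * 50 + ["old"] * 50
sample = ["young"] * 20 + ["old"] * 80

weights_by_cell = cell_weights(population, sample)
sample_weights = [weights_by_cell[c] for c in sample]

# An outcome that differs by cell: the unweighted mean is pulled toward
# the over-represented cell, the weighted mean is not.
outcome = [1.0 if c == "young" else 0.0 for c in sample]
naive_mean = sum(outcome) / len(outcome)
corrected_mean = weighted_mean(outcome, sample_weights)
```

The same per-individual weights can be passed to any weighted estimator; `weighted_slope` shows the simplest regression case. Weights of this kind only correct for selection on the variables used to define the cells.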
Original language: English
Pages: 1-56
Number of pages: 56
Publication status: Published - 16 May 2022
