Bias in regression analysis: Problems and solutions

Noah Alexandra Schuster

    Research output: PhD Thesis (PhD-Thesis - Research and graduation internal)

    Abstract

    Background: Epidemiologists are generally interested in the effect of an exposure on an outcome. This exposure effect is often estimated using regression analysis, in which the outcome is regressed on the exposure. The distribution of the outcome determines which regression technique is most appropriate: in epidemiological research, linear regression (for continuous outcomes), logistic regression (for binary outcomes) and Cox regression (for survival outcomes) are most commonly applied. The aim is generally to isolate the true effect of the exposure on the outcome. However, the association between an exposure and an outcome is often not entirely attributable to the exposure, i.e., the estimated effect is biased. If this bias is not accounted for, the estimated effect is not a good representation of the true underlying effect. This could, for instance, influence clinical decision making and result in under- or overtreatment of patients.

    Aim: In this thesis, I provide non-technical and non-mathematical descriptions of various situations in which bias can occur in regression analysis and propose solutions where possible. I focus on four potential sources of bias: the estimation of non-linear effects, noncollapsibility, causal mediation analysis and competing risks. In each chapter the theory is illustrated using an empirical data example from the Longitudinal Aging Study Amsterdam or the Amsterdam Growth and Health Longitudinal Study. Some chapters additionally contain a simulation study to evaluate model performance and compare methods.

    Conclusion: Although regression models are commonly used in epidemiological research to estimate exposure effects, researchers often do not consider the many different ways in which bias can occur. In this thesis, I reviewed four potential sources of bias in regression analysis and proposed solutions where possible. To avoid bias, it is recommended that researchers consider the potential sources in the pre-analysis phase and adapt their analysis if necessary. In addition, it is recommended to transparently report the measures taken to reduce bias and to carefully interpret the results, taking any remaining bias into consideration. Finally, I believe that the field of epidemiology would benefit from more non-technical and non-mathematical papers on advanced topics, to which I aimed to contribute with this thesis.

    Original language: English
    Qualification: PhD
    Awarding Institution
    • Vrije Universiteit Amsterdam
    Supervisors/Advisors
    • Twisk, J.W.R., Supervisor
    • Heymans, Martinus Wilhelmus, Co-supervisor
    • Rijnhart, Judith Johanna Maria, Co-supervisor
    Award date: 9 Nov 2022
    Publication status: Published - 9 Nov 2022

    Keywords

    • Regression analysis
    • bias
    • non-linearity
    • confounding
    • non-collapsibility
    • competing risk
    • mediation analysis
