Systematic Gene Prioritization in Genome-Wide Association Studies

Research output: PhD Thesis (PhD-Thesis - Research and graduation internal)

Abstract

In chapter 1 we give a brief overview of genetic studies as a way to investigate disease. We discuss the relevance of GWAS, how they are currently used to advance our understanding of disease, and the challenges that remain for the analysis of GWAS results.

In chapter 2 we give an overview of computational tools and methods that can be used to interpret non-coding variants in a disease context. This serves mainly to guide researchers in deciding which methods most appropriately address their research question. We observed that many different tools have been developed to aid the interpretation of GWAS variants in a biological context.

In chapter 3 we present FLAMES, a novel tool that aims to prioritize the most likely causal gene in a GWAS locus. Through extensive benchmarking we showed that FLAMES outperforms current state-of-the-art gene prioritization methods. As a proof of principle, we applied FLAMES to a GWAS of giving birth to dizygotic twins and resolved the FSHB locus: FSHB has long been the suspected effector gene in this locus, but was never formally indicated by gene-prioritization methods. We also applied FLAMES to a large schizophrenia GWAS and found multiple effector genes that together showed strong enrichment for synaptic genes, showing that FLAMES is able to prioritize genes in previously indicated biological pathways. In addition, we show that schizophrenia effector genes separate into a neurodevelopmental and a non-neurodevelopmental gene cluster based on expression levels throughout the lifespan.

In chapter 4 we explored how model optimization can reduce the number of prediction features necessary for confident prediction, with reduced model complexity generally improving interpretability. As part of this work we developed CALDERA, a regression-based gene-prediction framework, which performs similarly to FLAMES (chapter 3) despite using considerably fewer features. Moreover, the CALDERA model provides greater flexibility, allowing for the straightforward inclusion of additional gene parameters that can be used to eliminate possible biases introduced in the training steps of CALDERA.

In chapter 5 we showed that there are pervasive biases in the effector gene predictions generated by Polygenic Prioritization Scores (PoPS). This is problematic given the broad use of this tool within the field. These biases stem from a combination of feature scaling, the inclusion of both binary and continuous features, and the overrepresentation of specific genes in the binary features. Through simulations we showed that the observed biases are consistent and predictable, and that they likely impacted the benchmarking of PoPS and of methods using PoPS. We show that the model is well calibrated when using only expression features, and provide recommendations for further development of PoPS.

In chapter 6 we applied several post-GWAS analysis methods, including FLAMES, to elucidate the shared genetic aetiology of insomnia, depression and anxiety. The shared genetics of anxiety and depression have been studied extensively before, but the combined shared genetics of insomnia, anxiety and depression have not. We investigated these traits at the locus, variant and gene level in multiple ancestries, giving a complete overview of the shared genetic mechanisms that underlie disease liability. We found broad convergence of signal (~1/3) on genome-wide significant loci. We found that the genes shared between these traits converge on neurodevelopmental, immunological and synaptic signalling pathways.

In chapter 7 we place these findings into a broader context. We discuss how the chapters in this thesis have advanced the field of gene prioritization. We expand upon how the developments in this thesis have allowed for novel disease-gene associations, and on the general benefits of integrative gene-prioritization frameworks. Lastly, we broadly discuss the challenges that remain for the field regarding the topics discussed in this thesis.
Original language: English
Qualification: PhD
Awarding Institution
  • Vrije Universiteit Amsterdam
Supervisors/Advisors
  • Posthuma, Danielle, Supervisor
  • de Leeuw, Christiaan, Co-supervisor
Award date: 9 Mar 2026
Publication status: Published - 9 Mar 2026

Keywords

  • GWAS
  • genetics
  • gene prioritization
  • bioinformatics
