GOAT: efficient and robust identification of gene set enrichment

Frank Koopmans*

*Corresponding author for this work

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Gene set enrichment analysis is foundational to the interpretation of high-throughput biology. Identifying enriched Gene Ontology (GO) terms or disease-associated gene sets within a list of gene effect sizes that represent experimental outcomes is an everyday task in the life sciences, and it depends crucially on robust and sensitive statistical tools. Here we present GOAT, a parameter-free algorithm for gene set enrichment analysis of preranked gene lists. The algorithm can precompute null distributions from standardized gene scores, enabling enrichment testing of the GO database in one second. Validations using synthetic data show that estimated gene set p-values are well calibrated under the null hypothesis and invariant to gene list length and gene set size. Application to various real-world proteomics and gene expression studies demonstrates that GOAT identifies more significant GO terms than current methods. GOAT is freely available as an R package and as a user-friendly online tool for gene set enrichment analyses that includes interactive data visualizations: https://ftwkoopmans.github.io/goat.
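
The idea of precomputing null distributions from standardized gene scores can be illustrated with a minimal sketch. This is not GOAT's implementation: it assumes plain random resampling and a simple rank-based score, and the function names `precompute_null` and `enrichment_pvalue` are hypothetical. It only shows why standardized scores help: when scores depend on ranks alone, the null distribution of a gene set's mean score depends only on the gene list length and the gene set size, so it can be computed once and reused for every gene set of that size.

```python
import bisect
import random


def precompute_null(scores, set_size, n_draws=2000, seed=0):
    """Mean score of randomly drawn gene sets of one size, sorted ascending.

    Assumption: the null is approximated by simple random resampling.
    """
    rng = random.Random(seed)
    null = [sum(rng.sample(scores, set_size)) / set_size for _ in range(n_draws)]
    null.sort()
    return null


def enrichment_pvalue(observed_mean, null):
    """One-sided empirical p-value with the usual +1 correction."""
    n_ge = len(null) - bisect.bisect_left(null, observed_mean)
    return (n_ge + 1) / (len(null) + 1)


# Toy standardized gene scores: rank / n, so they depend only on list length.
n_genes = 1000
scores = [(rank + 1) / n_genes for rank in range(n_genes)]

# Precompute one null distribution per gene set size; reuse it for all sets.
null_by_size = {size: precompute_null(scores, size) for size in (10, 50)}

# A gene set made of the ten highest-scoring genes should look enriched.
top_set = scores[-10:]
p_top = enrichment_pvalue(sum(top_set) / len(top_set), null_by_size[10])
```

In this sketch the per-set test is a single binary search in a precomputed list, so testing many gene sets costs little once the nulls exist.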

Original language: English
Article number: 744
Pages (from-to): 1-9
Number of pages: 9
Journal: Communications biology
Volume: 7
Early online date: 19 Jun 2024
Publication status: Published - 2024

Bibliographical note

Publisher Copyright:
© The Author(s) 2024.
