
Apply Machine Learning to Predict Risk for Adolescent Depression in a Cohort of Kenyan Adolescents

  • Hyungrok Do
  • Keng Yen Huang*
  • Sabrina Cheng
  • Leonard Njeru Njiru
  • Shilla Mwaniga Mwavua
  • Anne Atie Obondo
  • Manasi Kumar

*Corresponding author for this work

Research output: Contribution to Journal, Article, Academic, peer-review

Abstract

Background: Adolescent depression is highly prevalent in low- and middle-income countries (LMICs). Identifying the key risk factors is necessary to inform the design of effective prevention programs. Machine learning (ML) offers a powerful approach for analyzing complex multidomain data to identify the most relevant predictors and estimate risk for mental health problems. This paper applies ML to study risk factors for adolescent depression, with the aim of strengthening depression prevention efforts in LMICs.

Methods: Six ML approaches (e.g., Explainable Boosting Machine, random forests, and XGBoost) were applied to study the risks of depression. Data were drawn from a digital health intervention study conducted in Kenya (2024–2025, n = 269). Multiple domains of childhood and adolescent adversity and stress experiences were used to predict adolescent depression (measured using PHQ9-A).

Findings: We found that ML was a valuable approach for the early identification of adolescents at risk for depression. Among the six ML approaches applied, the random forest approach outperformed the others, especially when multiple domains of risk were included. We also found that neither childhood adversity nor home adversity alone was a strong predictor of depression. Adding adolescent stress experiences and community school adversity experiences significantly improved the accuracy of depression prediction. Using the top-15 and top-20 ranked factors, we achieved 74.8% and 75.1% accuracy in depression prediction, respectively, which was similar to the accuracy when all 49 adverse/stress factors were included in the predictive model (78.3%).

Conclusions: Innovative ML and modern predictive modeling approaches have the potential to transform preventive mental health care by making better use of multidomain data to identify individuals at risk for developing depression early and to identify top risk factors (for targeted individuals and/or populations). Findings from ML can inform tailored intervention design to better mitigate risks and prevent the development of depression. They can also inform better use of resources to target high-need cases and key determinants, which is particularly useful for LMICs and low-resource settings. This paper illustrates one example of how to move in this direction. Future research is needed to validate the approach.
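
The comparison described in the Findings (a model trained on the top-ranked factors versus one trained on all 49) can be illustrated with a minimal sketch. This is not the authors' code or data: the dataset below is synthetic (generated with scikit-learn's `make_classification`, sized to match n = 269 and 49 factors), the ranking uses random forest impurity-based importances as an assumed stand-in for whatever ranking method the study used, and the helper name `top_k_accuracy_comparison` is hypothetical. The accuracies it prints say nothing about the study's results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (ASSUMPTION): 269 samples and 49 features,
# mirroring only the sample size and factor count reported in the abstract.
X, y = make_classification(
    n_samples=269, n_features=49, n_informative=10, random_state=0
)


def top_k_accuracy_comparison(X, y, ks=(15, 20), seed=0):
    """Hypothetical helper: rank features on the training split with a
    random forest, then compare held-out accuracy for the top-k features
    against accuracy with all features."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )

    # Rank features using the training split only, so the held-out
    # split does not influence which features are selected.
    ranker = RandomForestClassifier(n_estimators=100, random_state=seed)
    ranker.fit(X_tr, y_tr)
    order = np.argsort(ranker.feature_importances_)[::-1]

    results = {}
    for k in (*ks, X.shape[1]):
        cols = order[:k]
        model = RandomForestClassifier(n_estimators=100, random_state=seed)
        model.fit(X_tr[:, cols], y_tr)
        results[k] = model.score(X_te[:, cols], y_te)
    return results


results = top_k_accuracy_comparison(X, y)
for k, acc in results.items():
    print(f"features used: {k:2d}  held-out accuracy: {acc:.3f}")
```

A single train/test split on 269 samples gives noisy estimates; repeated or cross-validated splits would be a more stable choice in practice.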

Original language: English
Article number: 2620
Pages (from-to): 1-19
Number of pages: 19
Journal: Healthcare (Switzerland)
Volume: 13
Issue number: 20
Early online date: 17 Oct 2025
DOIs
Publication status: Published - Oct 2025

Bibliographical note

This article belongs to the Special Issue Depression: Recognizing and Addressing Mental Health Challenges.

Publisher Copyright:
© 2025 by the authors.

Funding

The research was funded by grants from the US National Institutes of Health (NIH), grant numbers R33MH124149, R21MH131041, and R34MH137292.

Keywords

  • adolescents
  • adverse childhood experiences
  • depression
  • lower-middle-income country
  • machine learning
  • risk factors
  • stress
