Complex machine-learning algorithms and multivariable logistic regression on par in the prediction of insufficient clinical response to methotrexate in rheumatoid arthritis

H.R. Gosselt, M.M.A. Verhoeven, M. Bulatović-Ćalasan, P.M. Welsing, M.C.F.J. de Rotte, J.M.W. Hazes, F.P.J.G. Lafeber, M. Hoogendoorn, R. de Jonge

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. The goals of this study were, first, to examine whether machine-learning algorithms outperform multivariable logistic regression in the prediction of insufficient response to methotrexate (MTX); secondly, to examine which features are essential for correct prediction; and finally, to investigate whether the best performing model specifically identifies insufficient responders to MTX (combination) therapy. The prediction of insufficient response (3-month Disease Activity Score 28-Erythrocyte-sedimentation rate (DAS28-ESR) > 3.2) was assessed using logistic regression, least absolute shrinkage and selection operator (LASSO), random forest, and extreme gradient boosting (XGBoost). The baseline features of 355 rheumatoid arthritis (RA) patients from the “treatment in the Rotterdam Early Arthritis CoHort” (tREACH) and the U-Act-Early trial were combined for analyses. The model performances were compared using area under the curve (AUC) of receiver operating characteristic (ROC) curves, 95% confidence intervals (95% CI), and sensitivity and specificity. Finally, the best performing model following feature selection was tested on 101 RA patients starting tocilizumab (TCZ)-monotherapy. Logistic regression (AUC = 0.77, 95% CI: 0.68–0.86) performed as well as LASSO (AUC = 0.76, 95% CI: 0.67–0.85), random forest (AUC = 0.71, 95% CI: 0.61–0.81), and XGBoost (AUC = 0.70, 95% CI: 0.61–0.81), yet logistic regression reached the highest sensitivity (81%). The most important features were baseline DAS28 (components). For all algorithms, models with six features performed similarly to those with 16. When applied to the TCZ-monotherapy group, logistic regression’s sensitivity dropped significantly from 83% to 69% (p = 0.03). In the current dataset, logistic regression performed as well as the machine-learning algorithms in the prediction of insufficient response to MTX. Models could be reduced to six features, which is more conducive to clinical implementation. Interestingly, the prediction model was specific to MTX (combination) therapy response.
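
The evaluation metrics named in the abstract (AUC of the ROC curve, a 95% CI, and sensitivity and specificity) can be illustrated with a minimal, standard-library Python sketch. Everything here is an assumption for illustration: the toy labels and scores are invented and are not study data, the 0.5 classification threshold is arbitrary, and the percentile bootstrap is one common way to obtain a confidence interval, not necessarily the method the authors used.

```python
import random


def auc(labels, scores):
    """AUC via the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs in which the positive case has the
    higher score, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are
    classified as positive (here: predicted insufficient response)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap the AUC")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains one class only; AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = insufficient response, 0 = sufficient response.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

point_auc = auc(labels, scores)                       # 15 of 16 pairs: 0.9375
sens, spec = sensitivity_specificity(labels, scores)  # 0.75 and 0.75
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
```

In a comparison like the one in the abstract, the same functions would be applied to the predicted probabilities from each fitted model, so that the AUCs and their intervals can be set side by side.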
Original language: English
Article number: 44
Pages (from-to): 1-12
Journal: Journal of Personalized Medicine
Volume: 11
Issue number: 1
Publication status: Published - 1 Jan 2021

Funding

Funding: Erythrocyte-folate measurements in U-Act-Early were supported by Roche NL BV.

Funders: Roche NL BV
