An approach for feature selection with data modelling in LC-MS metabolomics

I. Plyushchenko, D. Shakhmatov, T. Bolotnik, T. Baygildiev, P.N. Nesterenko, I. Rodin

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

A data processing workflow for LC-MS-based metabolomics studies is proposed, comprising signal drift correction, univariate analysis, supervised learning, feature selection and unsupervised modelling. The approach requires only an annotation-free peak table and produces an extremely reduced set of the most relevant features, validated by Receiver Operating Characteristic analysis of the selected predictors, cross-validation and unsupervised projection. The workflow was initially optimised on the authors' own experimental dataset and then successfully tested on 36 datasets from 21 publicly available metabolomics projects. It can be used for classification in high-dimensional metabolomics studies and as a first step in exploratory analysis, data projection, biomarker selection, data integration and fusion.

© The Royal Society of Chemistry.
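One ingredient named in the abstract, Receiver Operating Characteristic analysis of individual predictors, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy peak table and the choice to rank features by single-feature AUC are all assumptions made for illustration, and the record above does not specify how the published workflow combines its steps.

```python
# Illustrative only: rank features of an annotation-free peak table by
# single-feature ROC AUC. All names and data are hypothetical, not from the paper.

def roc_auc(values, labels):
    """AUC of one feature via the Mann-Whitney U statistic (ties count 0.5)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(peak_table, labels, top_k=2):
    """Return the top_k (feature, score) pairs, with score = max(AUC, 1 - AUC)."""
    scored = []
    for name, values in peak_table.items():
        auc = roc_auc(values, labels)
        # A feature that is lower in the positive class is equally informative.
        scored.append((name, max(auc, 1.0 - auc)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy peak table: feature name -> intensities across six samples (made-up values).
peak_table = {
    "feature_A": [1.0, 1.2, 0.9, 3.1, 2.8, 3.3],
    "feature_B": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "feature_C": [5.0, 4.8, 5.2, 1.0, 1.1, 0.9],
}
labels = [0, 0, 0, 1, 1, 1]  # 0 = control, 1 = case (hypothetical grouping)
selected = rank_features(peak_table, labels)
```

In this toy table, `feature_A` and `feature_C` separate the two groups perfectly and are retained, while `feature_B` scores at chance level. A real study would also need drift correction, multiple-testing control and cross-validation, which this sketch omits.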
Original language: English
Pages (from-to): 3582-3591
Journal: Analytical Methods
Volume: 12
Issue number: 28
Publication status: Published - 28 Jul 2020
Externally published: Yes

Funding

This work was funded by the Russian Foundation for Basic Research (RFBR), research project No. 19-33-90071. The authors thank George Varziev (CEO, InterAnalyt – General Distributor of Shimadzu – Russia) for providing the LC-MS instrument. In addition, the authors would like to thank Oleg Mayboroda (Dr, Leiden University Medical Center), Boris Sarvin (Ph.D. candidate, Israel Institute of Technology), Andrey Stavrianidi (Ph.D., Lomonosov Moscow State University), Elizaveta Fedorova (Ph.D. candidate, IPCE RAS) and two anonymous reviewers for their useful comments on the manuscript.

Funder: Russian Foundation for Basic Research
Funder number: 19-33-90071
