Characterization of Cancer Types by Applying Machine Learning Methods on Blood RNA-Sequencing Data

Cem Bugra Alkan, Zerrin Isik

Research output: Chapter in Book / Report / Conference proceeding; Conference contribution; Academic; peer-review

Abstract

RNA-sequencing data are used to measure the mRNA levels of genes in tissue or blood samples. RNA sequencing allows critical changes in the transcriptome to be observed more accurately, which eventually leads to a better understanding of the different behaviors of a disease. In this study, different feature selection methods and machine learning algorithms are compared for the accurate classification of cancer types using RNA-sequencing data from blood samples. In the analysis, seven cancer types were compared with each other and with healthy samples. Correlation coefficient and information gain analyses were applied as feature selection methods. The selected genes were provided as input to Support Vector Machine (SVM), Naïve Bayes (NB), and Random Forest (RF) classifiers. All machine learning methods were evaluated with 10-fold cross-validation. In the experiments, the machine learning models achieved higher than 85% accuracy in discriminating hepatobiliary, lung, and pancreatic cancer types. In terms of accuracy, RF and SVM were more successful than NB in many cases.
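
The two feature selection criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how expression values were discretized or which tooling was used, so the median split, the binary cancer/healthy labels, the function names, and the toy gene values below are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one gene after splitting its expression at the median.

    The median split is an assumption of this sketch, not taken from the paper.
    """
    ordered = sorted(values)
    mid = len(ordered) // 2
    median = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    low = [y for x, y in zip(values, labels) if x <= median]
    high = [y for x, y in zip(values, labels) if x > median]
    remainder = sum(len(part) / len(labels) * entropy(part) for part in (low, high) if part)
    return entropy(labels) - remainder


def abs_correlation(values, targets):
    """Absolute Pearson correlation between expression values and a 0/1 label."""
    n = len(values)
    mean_x, mean_y = sum(values) / n, sum(targets) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(values, targets))
    var_x = sum((x - mean_x) ** 2 for x in values)
    var_y = sum((y - mean_y) ** 2 for y in targets)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant gene carries no signal
    return abs(cov) / math.sqrt(var_x * var_y)


def rank_genes(expression, labels, score):
    """Return gene names sorted from most to least informative under `score`."""
    return sorted(expression, key=lambda gene: score(expression[gene], labels), reverse=True)


# Made-up toy data: 3 hypothetical genes, 6 samples (1 = cancer, 0 = healthy).
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_a": [9.1, 8.7, 9.4, 2.0, 2.3, 1.8],  # separates the two classes cleanly
    "gene_b": [5.0, 1.0, 5.2, 4.9, 5.1, 1.1],  # only weakly related to the label
    "gene_c": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],  # constant across samples
}
top_by_ig = rank_genes(expression, labels, information_gain)
top_by_corr = rank_genes(expression, labels, abs_correlation)
```

In a pipeline like the one described, the top-ranked genes from either score would then be passed to the classifiers and assessed with cross-validation; to avoid optimistic accuracy estimates, the ranking is usually recomputed inside each training fold.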
Original language: English
Title of host publication: 3rd International Symposium on Multidisciplinary Studies and Innovative Technologies, ISMSIT 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728137896
Publication status: Published - 1 Oct 2019
Externally published: Yes
Event: 3rd International Symposium on Multidisciplinary Studies and Innovative Technologies, ISMSIT 2019 - Ankara, Turkey
Duration: 11 Oct 2019 to 13 Oct 2019

Conference

Conference: 3rd International Symposium on Multidisciplinary Studies and Innovative Technologies, ISMSIT 2019
Country/Territory: Turkey
City: Ankara
Period: 11/10/19 to 13/10/19
