TY - GEN
T1 - Drug-drug interaction prediction based on knowledge graph embeddings and convolutional-LSTM network
AU - Rezaul Karim, Md
AU - Cochez, Michael
AU - Jares, Joao Bosco
AU - Uddin, Mamtaz
AU - Beyan, Oya
AU - Decker, Stefan
PY - 2019/9/4
Y1 - 2019/9/4
N2 - Interference between pharmacological substances can cause serious medical injuries. Correctly predicting so-called drug-drug interactions (DDIs) not only reduces these cases but can also reduce drug development costs. Presently, most drug-related knowledge is the result of clinical evaluations and post-marketing surveillance, resulting in a limited amount of information. Existing data-driven prediction approaches for DDIs typically rely on a single source of information, while using information from multiple sources would help improve predictions. Machine learning (ML) techniques are used, but they are often unable to deal with skew in the data. Hence, we propose a new ML approach for predicting DDIs based on multiple data sources. For this task we use 12,000 drug features from DrugBank, PharmGKB, and KEGG drugs, which are integrated using Knowledge Graphs (KGs). To train our prediction model, we first embed the nodes in the graph using various embedding approaches. We found that the best performing combination was a ComplEx embedding method created using PyTorch-BigGraph (PBG) with a Convolutional-LSTM network and classic machine learning based prediction models. The model averaging ensemble of the three best classifiers yields up to 0.94, 0.92, and 0.80 for AUPR, F1-score, and MCC, respectively, during 5-fold cross-validation tests.
AB - Interference between pharmacological substances can cause serious medical injuries. Correctly predicting so-called drug-drug interactions (DDIs) not only reduces these cases but can also reduce drug development costs. Presently, most drug-related knowledge is the result of clinical evaluations and post-marketing surveillance, resulting in a limited amount of information. Existing data-driven prediction approaches for DDIs typically rely on a single source of information, while using information from multiple sources would help improve predictions. Machine learning (ML) techniques are used, but they are often unable to deal with skew in the data. Hence, we propose a new ML approach for predicting DDIs based on multiple data sources. For this task we use 12,000 drug features from DrugBank, PharmGKB, and KEGG drugs, which are integrated using Knowledge Graphs (KGs). To train our prediction model, we first embed the nodes in the graph using various embedding approaches. We found that the best performing combination was a ComplEx embedding method created using PyTorch-BigGraph (PBG) with a Convolutional-LSTM network and classic machine learning based prediction models. The model averaging ensemble of the three best classifiers yields up to 0.94, 0.92, and 0.80 for AUPR, F1-score, and MCC, respectively, during 5-fold cross-validation tests.
KW - Conv-LSTM network
KW - Drug-drug interactions
KW - Graph embeddings
KW - Knowledge graphs
KW - Linked data
KW - Model averaging ensemble
UR - http://www.scopus.com/inward/record.url?scp=85073143900&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073143900&partnerID=8YFLogxK
U2 - 10.1145/3307339.3342161
DO - 10.1145/3307339.3342161
M3 - Conference contribution
T3 - ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
SP - 113
EP - 123
BT - ACM-BCB 2019 - Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
PB - Association for Computing Machinery, Inc
T2 - 10th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2019
Y2 - 7 September 2019 through 10 September 2019
ER -