Extraction of semantic relations from medical literature based on semantic predicates and SVM

Xiaoli Zhao*, Shaofu Lin, Zhisheng Huang

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Relationships between biomedical entities are the cornerstone of acquiring biomedical knowledge. They are of great significance to the construction of related databases in the biomedical field and to the management of medical literature. How to quickly and accurately extract the required relationships between biomedical entities from massive unstructured literature is an important research problem. To improve accuracy, we use a support vector machine (SVM), a machine learning algorithm based on feature vectors, to extract relationships between entities. We extract the five main relationships in medical literature: ISA, PART_OF, CAUSES, TREATS and DIAGNOSES. First, related topics, such as disease-drug and cause-disease, are used to search the PubMed database for medical literature. The retrieved documents serve as experimental data and are processed to form a corpus. For feature selection, information gain is used to select the most influential features of the entities themselves and of their context. On this basis, semantic predicates are added as a feature to improve accuracy. The experimental results show that extraction accuracy increases by 5%–10%. Finally, the Resource Description Framework (RDF) is used to store the relationships extracted from the corresponding documents, which supports the subsequent retrieval of related documents.
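
The abstract names information gain as the feature-selection criterion but does not give the computation. The following is a minimal sketch of how information gain could rank candidate features. It is not the authors' implementation: the function names, the toy labels, and the two example features (`predicate`, `has_digit`) are all assumptions invented for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data, invented for illustration only (not taken from the paper's corpus).
labels = ["TREATS", "TREATS", "CAUSES", "CAUSES"]
predicate = ["treat", "treat", "cause", "cause"]  # hypothetical semantic-predicate feature
has_digit = [True, False, True, False]            # hypothetical uninformative feature

gain_predicate = information_gain(predicate, labels)
gain_digit = information_gain(has_digit, labels)
```

In this toy setting the predicate feature separates the two relation types perfectly and so receives a higher gain than the uninformative feature. Features ranked this way could then be encoded as the feature vectors passed to an SVM.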

Original language: English
Title of host publication: Health Information Science - 7th International Conference, HIS 2018, Proceedings
Editors: Rui Zhou, Siuly Siuly, Hua Wang, Zhisheng Huang, Ickjai Lee, Wei Xiang
Publisher: Springer-Verlag
Pages: 17-24
Number of pages: 8
ISBN (Print): 9783030010775
Publication status: Published - 2018
Event: 7th International Conference on Health Information Science, HIS 2018 - Cairns, QLD, Australia
Duration: 5 Oct 2018 to 7 Oct 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11148 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 7th International Conference on Health Information Science, HIS 2018
Country: Australia
City: Cairns, QLD
Period: 5/10/18 to 7/10/18

Keywords

  • Multi-classification
  • RDF
  • Relation extraction
  • Semantic technology
  • SVM
