Named Entity Recognition for Long COVID Biomedical Literature by Using Bert-BiLSTM-IDCNN-ATT-CRF Approach

Zongwang Han, Shaofu Lin*, Zhisheng Huang, Chaohui Guo

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

In recent years, as research into the pathological mechanisms and treatments of Long COVID has expanded, the number of related scientific publications has increased dramatically. Extracting key information from these texts effectively is of great importance for public health and research progress. In the Long COVID context, Named Entity Recognition (NER) can identify disease names and symptoms, which helps to analyze the sequelae caused by COVID-19 and its relationship with other diseases. Unlike molecular biomedical text mining, which focuses on identifying entities such as genes, proteins, and chemicals and the relationships among them, Long COVID text mining suffers from a lack of publicly available labeled datasets and a heavy manual annotation workload. Moreover, because Long COVID named entities are strongly domain-specific, models and methods that perform well in the general domain degrade significantly on this domain. To address these problems, we constructed a Long COVID literature abstract NER dataset (LNER) and propose a Long COVID biomedical literature NER model, Bert-BiLSTM-IDCNN-ATT-CRF (BBIAC). First, a BERT-BiLSTM-CRF model is built on the LNER dataset. Then, an iterated dilated convolutional neural network (IDCNN) is added between the BiLSTM and CRF layers to capture local features of the text sequences. Finally, an attention mechanism fuses the global and local features to enhance the representation. Experimental results show that the proposed method accurately extracts Long COVID symptom and disease entities from the literature and outperforms the baseline models.
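To illustrate why a dilated convolution stage can capture local context efficiently, the following is a minimal pure-Python sketch of stacked 1-D dilated convolutions. It is not the authors' implementation: the function names, the kernel width of 3, and the dilation rates 1, 1, 2 are illustrative assumptions, and a real IDCNN layer would use learned multi-channel filters applied to the BiLSTM outputs.

```python
def dilated_conv1d(seq, kernel, dilation):
    """1-D dilated convolution with zero padding (output has the input's length)."""
    half = len(kernel) // 2
    out = []
    for t in range(len(seq)):
        total = 0.0
        for k, w in enumerate(kernel):
            # Taps are spaced `dilation` positions apart around position t.
            idx = t + (k - half) * dilation
            if 0 <= idx < len(seq):
                total += w * seq[idx]
        out.append(total)
    return out


def dilated_block(seq, kernel, dilations):
    """Apply one convolution per dilation rate, feeding each output forward."""
    for d in dilations:
        seq = dilated_conv1d(seq, kernel, d)
    return seq


def receptive_field(kernel_width, dilations):
    """Number of input positions that can influence a single output position."""
    field = 1
    for d in dilations:
        field += (kernel_width - 1) * d
    return field


if __name__ == "__main__":
    # Hypothetical settings, chosen only for illustration.
    dilations = [1, 1, 2]
    kernel = [1.0, 1.0, 1.0]

    # A single "impulse" token in the middle of an 11-token sequence.
    impulse = [0.0] * 5 + [1.0] + [0.0] * 5
    response = dilated_block(impulse, kernel, dilations)

    print("receptive field:", receptive_field(len(kernel), dilations))
    print("positions reached:", [i for i, v in enumerate(response) if v != 0])
```

With only three layers of width-3 filters, each output position sees a wider window than three undilated layers would provide, which is the property that makes dilated stacks attractive for extracting local features alongside the sequence-level context from a BiLSTM.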

Original language: English
Title of host publication: ISAIMS '23
Subtitle of host publication: Proceedings of the 2023 4th International Symposium on Artificial Intelligence for Medicine Science
Publisher: Association for Computing Machinery
Pages: 1200-1205
Number of pages: 6
ISBN (Electronic): 9798400708138
Publication status: Published - 2023
Event: 4th International Symposium on Artificial Intelligence for Medicine Science, ISAIMS 2023 - Hybrid, Chengdu, China
Duration: 20 Oct 2023 → 23 Oct 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 4th International Symposium on Artificial Intelligence for Medicine Science, ISAIMS 2023
Country/Territory: China
City: Hybrid, Chengdu
Period: 20/10/23 → 23/10/23

Bibliographical note

Publisher Copyright:
© 2023 ACM.

Keywords

  • Bert-BiLSTM-IDCNN-ATT-CRF
  • Biomedical literature
  • LNER Dataset
  • Long COVID
  • Named Entity Recognition
