Abstract
Deep learning has achieved remarkable success in a wide range of domains. However, it has not been comprehensively evaluated as a solution for Chinese biomedical named entity recognition (Bio-NER). The traditional deep-learning approach to Bio-NER is usually based on recurrent neural networks (RNNs) and takes only word embeddings into consideration, ignoring the value of character-level embeddings for encoding morphological and shape information. We propose an RNN-based approach, WCP-RNN, for the Chinese Bio-NER problem. Our method combines word embeddings and character embeddings to capture orthographic and lexicosemantic features. In addition, part-of-speech (POS) tags are incorporated as a priori word information to improve the final performance. The experimental results show that the proposed approach outperforms the baseline method: the highest F-scores for the subject and lesion detection tasks reach 90.36% and 90.48%, increases of 3.10% and 2.60% over the baseline, respectively.
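As a rough illustration of the input representation described in the abstract, the following Python sketch builds one feature vector per token by concatenating a word embedding, a character-derived vector, and a one-hot POS tag. It is not the authors' implementation: the dimensions, the tag set, the random embeddings, the mean-pooling of character vectors, the example sentence, and all names (`EmbeddingTable`, `token_features`, and so on) are assumptions made for illustration.

```python
import random

random.seed(0)

# Hypothetical sizes and tag set, chosen only to keep the example small.
WORD_DIM, CHAR_DIM = 4, 3
POS_TAGS = ["n", "v", "a"]


def random_vector(dim):
    """Stand-in for a trained embedding: a fixed random vector."""
    return [random.uniform(-1.0, 1.0) for _ in range(dim)]


class EmbeddingTable:
    """Assigns a vector to each key on first lookup and reuses it afterwards."""

    def __init__(self, dim):
        self.dim = dim
        self.table = {}

    def lookup(self, key):
        if key not in self.table:
            self.table[key] = random_vector(self.dim)
        return self.table[key]


word_table = EmbeddingTable(WORD_DIM)
char_table = EmbeddingTable(CHAR_DIM)


def char_feature(word):
    # Assumption: mean-pool the character embeddings of a non-empty word.
    # A trained model would more likely use a learned character encoder.
    vectors = [char_table.lookup(ch) for ch in word]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def pos_one_hot(tag):
    return [1.0 if tag == known else 0.0 for known in POS_TAGS]


def token_features(word, tag):
    # Concatenate word-level, character-level, and POS information.
    return word_table.lookup(word) + char_feature(word) + pos_one_hot(tag)


# Hypothetical segmented and POS-tagged input: (word, tag) pairs.
sentence = [("肝脏", "n"), ("增大", "v")]
features = [token_features(word, tag) for word, tag in sentence]
print(len(features), len(features[0]))  # 2 10
```

In an RNN-based tagger, vectors like these would be fed to the recurrent layer one token at a time. How WCP-RNN actually encodes characters and POS tags is not specified in the abstract.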
Original language | English |
---|---|
Pages (from-to) | 1450-1467 |
Number of pages | 18 |
Journal | Journal of Supercomputing |
Volume | 76 |
Issue number | 3 |
Early online date | 16 Jan 2018 |
Publication status | Published - Mar 2020 |
Keywords
- Bio-NER
- Chinese EMRs
- POS tags
- RNN-based model