Predicting entity mentions in scientific literature

Yalung Zheng, Jon Ezeiza, Mehdi Farzanehpour, Jacopo Urbani

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Predicting which entities are likely to be mentioned in scientific articles is a task with significant academic and commercial value. For instance, it can lead to monetary savings if the articles are behind paywalls, or be used to recommend articles that are not yet available. Despite extensive prior work on entity prediction in Web documents, the peculiarities of scientific literature make it a unique scenario for this task. In this paper, we present an approach that uses a neural network to predict whether the (unseen) body of an article contains entities defined in domain-specific knowledge bases (KBs). The network uses features from the abstracts and the KB, and it is trained using open-access articles and authors’ prior works. Our experiments on biomedical literature show that our method is able to predict subsets of entities with high accuracy. As far as we know, our method is the first of its kind and is currently used in several commercial settings.
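
The prediction setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the three features, the toy training pairs, and every function name below are hypothetical, and a single logistic unit (the simplest possible neural network) stands in for the network used in the paper. Each sample is one (article, entity) pair, with features drawn from the abstract, the authors' prior works, and the KB; the label says whether the entity appears in the article body.

```python
import math


def features(entity, abstract_entities, prior_entities, kb_neighbors):
    """Build a hypothetical feature vector for one (article, entity) pair."""
    neighbors = kb_neighbors.get(entity, set())
    return [
        1.0 if entity in abstract_entities else 0.0,    # entity named in the abstract
        1.0 if entity in prior_entities else 0.0,       # entity in the authors' prior works
        1.0 if neighbors & abstract_entities else 0.0,  # a KB neighbor is named in the abstract
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Estimated probability that the entity is mentioned in the article body."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train(samples, epochs=1000, lr=0.5):
    """Fit the logistic unit with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = predict(weights, bias, x) - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Made-up toy training pairs (feature vector, label); label 1 means the
# entity occurs in the body. Real training data would come from open-access
# articles, as the abstract states.
TOY_SAMPLES = [
    ([1.0, 0.0, 0.0], 1),
    ([0.0, 1.0, 0.0], 1),
    ([1.0, 1.0, 1.0], 1),
    ([0.0, 0.0, 1.0], 0),
    ([0.0, 0.0, 0.0], 0),
]

if __name__ == "__main__":
    weights, bias = train(TOY_SAMPLES)
    kb = {"aspirin": {"ibuprofen"}}  # hypothetical KB neighborhood
    x = features("aspirin", {"ibuprofen"}, {"aspirin"}, kb)
    print(predict(weights, bias, x))
```

Scoring every KB entity this way for an unseen article, and keeping those above a chosen threshold, would yield the predicted set of mentions; the threshold and the feature set are design choices this sketch leaves open.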

Original language: English
Title of host publication: The Semantic Web - 16th International Conference, ESWC 2019, Proceedings
Editors: Amrapali Zaveri, Alasdair J.G. Gray, Karl Hammar, Pascal Hitzler, Vanessa Lopez, Krzysztof Janowicz, Miriam Fernández, Armin Haller
Publisher: Springer Verlag
Pages: 379-393
Number of pages: 15
ISBN (Print): 9783030213473
DOI: https://doi.org/10.1007/978-3-030-21348-0_25
Publication status: Published - 1 Jan 2019
Event: 16th International Semantic Web Conference, ESWC 2019 - Portorož, Slovenia
Duration: 2 Jun 2019 to 6 Jun 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11503 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Semantic Web Conference, ESWC 2019
Country: Slovenia
City: Portorož
Period: 2/06/19 to 6/06/19

Fingerprint

Knowledge Base
Neural networks
Predict
High Accuracy
Experiments
Likely
Scenarios
Subset
Prediction

Cite this

Zheng, Y., Ezeiza, J., Farzanehpour, M., & Urbani, J. (2019). Predicting entity mentions in scientific literature. In A. Zaveri, A. J. G. Gray, K. Hammar, P. Hitzler, V. Lopez, K. Janowicz, M. Fernández, ... A. Haller (Eds.), The Semantic Web - 16th International Conference, ESWC 2019, Proceedings (pp. 379-393). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11503 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-21348-0_25
Zheng, Yalung ; Ezeiza, Jon ; Farzanehpour, Mehdi ; Urbani, Jacopo. / Predicting entity mentions in scientific literature. The Semantic Web - 16th International Conference, ESWC 2019, Proceedings. editor / Amrapali Zaveri ; Alasdair J.G. Gray ; Karl Hammar ; Pascal Hitzler ; Vanessa Lopez ; Krzysztof Janowicz ; Miriam Fernández ; Armin Haller. Springer Verlag, 2019. pp. 379-393 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{af026e99e2a64c5b88f65c04c849ed2f,
title = "Predicting entity mentions in scientific literature",
abstract = "Predicting which entities are likely to be mentioned in scientific articles is a task with significant academic and commercial value. For instance, it can lead to monetary savings if the articles are behind paywalls, or be used to recommend articles that are not yet available. Despite extensive prior work on entity prediction in Web documents, the peculiarities of scientific literature make it a unique scenario for this task. In this paper, we present an approach that uses a neural network to predict whether the (unseen) body of an article contains entities defined in domain-specific knowledge bases (KBs). The network uses features from the abstracts and the KB, and it is trained using open-access articles and authors’ prior works. Our experiments on biomedical literature show that our method is able to predict subsets of entities with high accuracy. As far as we know, our method is the first of its kind and is currently used in several commercial settings.",
author = "Yalung Zheng and Jon Ezeiza and Mehdi Farzanehpour and Jacopo Urbani",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/978-3-030-21348-0_25",
language = "English",
isbn = "9783030213473",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "379--393",
editor = "Amrapali Zaveri and Gray, {Alasdair J.G.} and Karl Hammar and Pascal Hitzler and Vanessa Lopez and Krzysztof Janowicz and Miriam Fern{\'a}ndez and Armin Haller",
booktitle = "The Semantic Web - 16th International Conference, ESWC 2019, Proceedings",
address = "Germany",

}

Zheng, Y, Ezeiza, J, Farzanehpour, M & Urbani, J 2019, Predicting entity mentions in scientific literature. in A Zaveri, AJG Gray, K Hammar, P Hitzler, V Lopez, K Janowicz, M Fernández & A Haller (eds), The Semantic Web - 16th International Conference, ESWC 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11503 LNCS, Springer Verlag, pp. 379-393, 16th International Semantic Web Conference, ESWC 2019, Portorož, Slovenia, 2/06/19. https://doi.org/10.1007/978-3-030-21348-0_25

Predicting entity mentions in scientific literature. / Zheng, Yalung; Ezeiza, Jon; Farzanehpour, Mehdi; Urbani, Jacopo.

The Semantic Web - 16th International Conference, ESWC 2019, Proceedings. ed. / Amrapali Zaveri; Alasdair J.G. Gray; Karl Hammar; Pascal Hitzler; Vanessa Lopez; Krzysztof Janowicz; Miriam Fernández; Armin Haller. Springer Verlag, 2019. p. 379-393 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11503 LNCS).

TY - GEN
T1 - Predicting entity mentions in scientific literature
AU - Zheng, Yalung
AU - Ezeiza, Jon
AU - Farzanehpour, Mehdi
AU - Urbani, Jacopo
PY - 2019/1/1
Y1 - 2019/1/1
N2 - Predicting which entities are likely to be mentioned in scientific articles is a task with significant academic and commercial value. For instance, it can lead to monetary savings if the articles are behind paywalls, or be used to recommend articles that are not yet available. Despite extensive prior work on entity prediction in Web documents, the peculiarities of scientific literature make it a unique scenario for this task. In this paper, we present an approach that uses a neural network to predict whether the (unseen) body of an article contains entities defined in domain-specific knowledge bases (KBs). The network uses features from the abstracts and the KB, and it is trained using open-access articles and authors’ prior works. Our experiments on biomedical literature show that our method is able to predict subsets of entities with high accuracy. As far as we know, our method is the first of its kind and is currently used in several commercial settings.
UR - http://www.scopus.com/inward/record.url?scp=85066802232&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066802232&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-21348-0_25
DO - 10.1007/978-3-030-21348-0_25
M3 - Conference contribution
SN - 9783030213473
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 379
EP - 393
BT - The Semantic Web - 16th International Conference, ESWC 2019, Proceedings
A2 - Zaveri, Amrapali
A2 - Gray, Alasdair J.G.
A2 - Hammar, Karl
A2 - Hitzler, Pascal
A2 - Lopez, Vanessa
A2 - Janowicz, Krzysztof
A2 - Fernández, Miriam
A2 - Haller, Armin
PB - Springer Verlag
ER -

Zheng Y, Ezeiza J, Farzanehpour M, Urbani J. Predicting entity mentions in scientific literature. In Zaveri A, Gray AJG, Hammar K, Hitzler P, Lopez V, Janowicz K, Fernández M, Haller A, editors, The Semantic Web - 16th International Conference, ESWC 2019, Proceedings. Springer Verlag. 2019. p. 379-393. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-21348-0_25