The role of knowledge in determining identity of long-tail entities

Filip Ilievski*, Eduard Hovy, Piek Vossen, Stefan Schlobach, Qizhe Xie

*Corresponding author for this work

Research output: Contribution to Journal › Article › Academic › peer-review


Abstract

NIL entities do not have an accessible representation, which means that their identity cannot be established through traditional disambiguation. Consequently, they have so far received little attention in entity linking systems and tasks. Given the non-redundancy of knowledge on NIL entities, the lack of frequency priors, their potentially extreme ambiguity, and their sheer number, they form an extreme class of long-tail entities and pose a great challenge for state-of-the-art systems. In this paper, we investigate the role of knowledge in establishing the identity of NIL entities mentioned in text. What kind of knowledge can be applied to establish the identity of NILs? Can we potentially link to them at a later point? How can implicit knowledge be captured and knowledge gaps in communication be filled? We formulate and test hypotheses to provide insights into these questions. Because instance-level knowledge is unavailable, we propose to enrich the locally extracted information with profiling models that rely on background knowledge in Wikidata. We describe and implement two profiling machines based on state-of-the-art neural models. We evaluate their intrinsic behavior and their impact on the task of determining the identity of NIL entities.

Original language: English
Article number: 100565
Pages (from-to): 1-18
Number of pages: 18
Journal: Journal of Web Semantics
Volume: 61-62
Publication status: Published - Mar 2020

Funding

The research reported in this paper has been funded by the Netherlands Organisation for Scientific Research (NWO) via the Spinoza fund.

Keywords

  • Knowledge-based completion
  • Long-tail entities
  • NIL clustering
