The role of knowledge in determining identity of long-tail entities

Filip Ilievski*, Eduard Hovy, Piek Vossen, Stefan Schlobach, Qizhe Xie

*Corresponding author for this work

Research output: Contribution to Journal, Article, Academic, peer-review

Abstract

NIL entities do not have an accessible representation, which means that their identity cannot be established through traditional disambiguation. Consequently, they have received little attention in entity linking systems and tasks so far. Given the non-redundancy of knowledge about NIL entities, the lack of frequency priors, their potentially extreme ambiguity, and their sheer number, they form an extreme class of long-tail entities and pose a great challenge for state-of-the-art systems. In this paper, we investigate the role of knowledge in establishing the identity of NIL entities mentioned in text. What kind of knowledge can be applied to establish the identity of NILs? Can we potentially link to them at a later point? How can we capture implicit knowledge and fill knowledge gaps in communication? We formulate and test hypotheses to provide insight into these questions. Because instance-level knowledge is unavailable, we propose to enrich the locally extracted information with profiling models that rely on background knowledge in Wikidata. We describe and implement two profiling machines based on state-of-the-art neural models. We evaluate their intrinsic behavior and their impact on the task of determining the identity of NIL entities.

Original language: English
Article number: 100565
Journal: Journal of Web Semantics
Publication status: Accepted/In press - 1 Jan 2020

Keywords

  • Knowledge-based completion
  • Long-tail entities
  • NIL clustering