Contextual entity disambiguation in domains with weak identity criteria: Disambiguating Golden Age Amsterdamers

Al Idrissou, Veruska Zamborlini, Frank van Harmelen, Chiara Latronico

Research output: Chapter in Book / Report / Conference proceeding; Conference contribution; Academic; peer-review


Abstract

Entity disambiguation is a widely investigated topic, and many matching algorithms have been proposed. However, the task has not yet been satisfactorily addressed for domains in which the data are poor or incomplete and have little discriminating power. In these cases, content fields such as name and date are not enough, and simply using relations with other entities is of little help when those related entities themselves need disambiguation before they can be used. We therefore propose an approach for disambiguating clustered resources that uses context (related entities that are also clustered) as evidence for reconciling matched entities. We test the proposed method on datasets of historical records from 17th-century Amsterdam for which context is available, and we compare its results to a gold standard generated by three experts, which we make available online. The results show that the proposed approach uses context meaningfully to isolate higher-quality identity sub-clusters by eliminating potentially false positive matches.
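The general idea of using clustered context as evidence can be illustrated with a minimal sketch. It is not the authors' algorithm: the record identifiers, the spouse relation, the function names, and the rule "keep a match only if both records have a related entity in the same context cluster" are all assumptions made for illustration.

```python
# Illustrative sketch only: all names and the support rule are hypothetical,
# not taken from the paper.

def supported(a, b, related, cluster_id):
    """True if records a and b share at least one context cluster."""
    ctx_a = {cluster_id[r] for r in related.get(a, ()) if r in cluster_id}
    ctx_b = {cluster_id[r] for r in related.get(b, ()) if r in cluster_id}
    return bool(ctx_a & ctx_b)


def sub_clusters(members, links, related, cluster_id):
    """Split one candidate cluster into context-supported sub-clusters."""
    adjacency = {m: set() for m in members}
    for a, b in links:
        # Drop match links that no shared context cluster backs up.
        if supported(a, b, related, cluster_id):
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, result = set(), []
    for start in sorted(members):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        result.append(component)
    return result


# Three records with the same name, matched pairwise on name alone.
members = {"p1", "p2", "p3"}
links = [("p1", "p2"), ("p2", "p3"), ("p1", "p3")]
# Hypothetical context: each person's spouse record.
related = {"p1": ["s1"], "p2": ["s2"], "p3": ["s3"]}
# The spouse records were clustered too: s1 and s2 match, s3 does not.
cluster_id = {"s1": "C1", "s2": "C1", "s3": "C2"}

result = sub_clusters(members, links, related, cluster_id)
print(result)
```

In this toy data only the link between p1 and p2 is backed by a shared spouse cluster, so the name-only cluster of three records splits into two sub-clusters and p3 is isolated as a potential false positive.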

Original language: English
Title of host publication: K-CAP 2019
Subtitle of host publication: Proceedings of the 10th International Conference on Knowledge Capture
Publisher: Association for Computing Machinery, Inc
Pages: 259-262
Number of pages: 4
ISBN (Electronic): 9781450370080
DOIs
Publication status: Published - Nov 2019
Event: 10th International Conference on Knowledge Capture, K-CAP 2019 - Marina Del Rey, United States
Duration: 19 Nov 2019 to 21 Nov 2019

Conference

Conference: 10th International Conference on Knowledge Capture, K-CAP 2019
Country/Territory: United States
City: Marina Del Rey
Period: 19/11/19 to 21/11/19

Funding

This work was supported by the European Union's 7th Framework Programme under the project RISIS (GA no. 313082) and by the Investment Subsidy NWO Large 2015-2016 under the Golden Agents project (no. 175.010.2015.009).

Funders and funder numbers
European Union's 7th Framework Programme: 313082
Nederlandse Organisatie voor Wetenschappelijk Onderzoek: 175.010.2015.009

    Keywords

    • Data integration
    • Entity disambiguation
    • Entity reconciliation
    • Entity resolution
    • Linked data
