Abstract
Entity disambiguation is a widely investigated topic, and many matching algorithms have been proposed. However, this task has not yet been satisfactorily addressed when the domain of interest provides poor or incomplete data with little discriminating power. In these cases, content fields such as name and date are not enough, and simply using relations with other entities is of little help when those related entities themselves need disambiguation before they can be used. Therefore, we propose an approach for the disambiguation of clustered resources using context (related entities that are also clustered) as evidence for reconciling matched entities. We test the proposed method on datasets of historical records from 17th-century Amsterdam for which context is available, and we compare its results to a gold standard generated by three experts, which we make available online. The results show that the proposed approach uses context meaningfully to isolate higher-quality identity sub-clusters by eliminating potentially false positive matches.
Original language | English |
---|---|
Title of host publication | K-CAP 2019 |
Subtitle of host publication | Proceedings of the 10th International Conference on Knowledge Capture |
Publisher | Association for Computing Machinery, Inc |
Pages | 259-262 |
Number of pages | 4 |
ISBN (Electronic) | 9781450370080 |
Publication status | Published - Nov 2019 |
Event | 10th International Conference on Knowledge Capture, K-CAP 2019 - Marina Del Rey, United States; Duration: 19 Nov 2019 → 21 Nov 2019 |
Conference
Conference | 10th International Conference on Knowledge Capture, K-CAP 2019 |
---|---|
Country/Territory | United States |
City | Marina Del Rey |
Period | 19/11/19 → 21/11/19 |
Funding
This work was supported by the European Union's 7th Framework Programme under the project RISIS (GA no. 313082) and by the Investment Subsidy NWO Large 2015-2016 under the Golden Agents project (no. 175.010.2015.009).
Funders | Funder number |
---|---|
European Union’s 7th Framework Programme | 313082 |
Nederlandse Organisatie voor Wetenschappelijk Onderzoek | 175.010.2015.009 |
Keywords
- Data integration
- Entity disambiguation
- Entity reconciliation
- Entity resolution
- Linked data