Detecting erroneous identity links on the web using network metrics

Joe Raad*, Wouter Beek, Frank van Harmelen, Nathalie Pernelle, Fatiha Saïs

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

In the absence of a central naming authority on the Semantic Web, it is common for different datasets to refer to the same thing by different IRIs. Whenever multiple names are used to denote the same thing, owl:sameAs statements are needed in order to link the data and foster reuse. Studies dating back as far as 2009 have observed that the owl:sameAs property is sometimes used incorrectly. In this paper, we show how network metrics such as the community structure of the owl:sameAs graph can be used to detect such possibly erroneous statements. One benefit of the approach presented here is that it can be applied to the network of owl:sameAs links itself and does not rely on any additional knowledge. To illustrate its ability to scale, the approach is evaluated on the largest collection of identity links to date, containing over 558M owl:sameAs links scraped from the LOD Cloud.
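
The idea of scoring identity links from the link network alone can be illustrated with a minimal Python sketch. It is not the method evaluated in the paper: it swaps in a much simpler network metric, the Jaccard overlap of the closed neighborhoods of a link's two endpoints, as a rough stand-in for community structure. The function names, the `ex:` IRIs, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical owl:sameAs links (illustrative IRIs, not real data):
# two tightly linked groups of three IRIs joined by a single link.
EXAMPLE_LINKS = [
    ("ex:a1", "ex:a2"), ("ex:a1", "ex:a3"), ("ex:a2", "ex:a3"),
    ("ex:b1", "ex:b2"), ("ex:b1", "ex:b3"), ("ex:b2", "ex:b3"),
    ("ex:a3", "ex:b1"),  # the only link between the two groups
]


def neighborhood_overlap(links):
    """Score each undirected link by the Jaccard overlap of the closed
    neighborhoods of its endpoints (1.0 = identical neighborhoods)."""
    neighbors = defaultdict(set)
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    scores = {}
    for a, b in links:
        # Closed neighborhood: the node itself plus its direct neighbors.
        na = neighbors[a] | {a}
        nb = neighbors[b] | {b}
        scores[(a, b)] = len(na & nb) / len(na | nb)
    return scores


def suspicious_links(links, threshold=0.5):
    """Return links whose overlap score falls below the (assumed) threshold.
    These are candidates for review, not confirmed errors."""
    scores = neighborhood_overlap(links)
    return sorted(link for link, score in scores.items() if score < threshold)


if __name__ == "__main__":
    for link in suspicious_links(EXAMPLE_LINKS):
        print("possibly erroneous:", link)
```

Only the link graph is consulted, so no descriptions of the linked resources are needed. A link between two IRIs that share most of their neighbors scores high, while a link that bridges two otherwise separate groups scores low and is flagged for review.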

Original language: English
Title of host publication: The Semantic Web – ISWC 2018 - 17th International Semantic Web Conference, 2018, Proceedings
Editors: Mari Carmen Suárez-Figueroa, Valentina Presutti, Lucie-Aimee Kaffee, Elena Simperl, Marta Sabou, Denny Vrandecic, Irene Celino, Kalina Bontcheva
Publisher: Springer/Verlag
Pages: 391-407
Number of pages: 17
ISBN (Print): 9783030006709
Publication status: Published - 2018
Event: 17th International Semantic Web Conference, ISWC 2018 - Monterey, United States
Duration: 8 Oct 2018 to 12 Oct 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11136 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Semantic Web Conference, ISWC 2018
Country/Territory: United States
City: Monterey
Period: 8/10/18 to 12/10/18

Funding

This work was partially conducted within the MaestroGraph project (612.001.553), funded by the Netherlands Organization for Scientific Research (NWO), and was partially supported by the Center for Data Science, funded by the IDEX Paris-Saclay, ANR-11-IDEX-0003-02.

Funders (funder number)
Netherlands Organization for Scientific Research
Agence Nationale de la Recherche
Nederlandse Organisatie voor Wetenschappelijk Onderzoek (612.001.553, ANR-11-IDEX-0003-02)

    Keywords

    • Communities
    • Identity
    • Linked Open Data
    • owl:sameAs
