On the Impact of sameAs on Schema Matching

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

In a large and decentralised knowledge representation system such as the Web of Data, it is common for data sets to overlap. In the absence of a central naming authority, semantic heterogeneity is inevitable as such overlapping contents are described using different schemas. To overcome this problem, a number of solutions have automated the integration of these data sets by matching their schemas. In this work we focus on a specific category of these solutions, which relies on the concepts’ extension for matching the schemas (i.e., instance-based methods). Rather than introducing a new approach for the task of schema matching, this work studies the effect of exploiting the semantics of owl:sameAs in such instance-based methods. For this empirical analysis, we investigate more than 900K concepts extracted from the Web, and make use of over 35B implicit identity assertions to study their impact. The experiments show that despite the growing doubts over their quality, exploiting owl:sameAs assertions extracted from the Web can improve instance-based schema matching techniques.
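As an illustration of the idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that two concepts are compared by the Jaccard overlap of their extensions (their instance sets), and that owl:sameAs assertions are exploited by replacing each instance with a representative of its identity class. The identity classes are built with a union-find structure, which accounts for the symmetry and transitivity of owl:sameAs (the source of the implicit assertions). All identifiers (`a:paris`, `b:paris`, and so on), the function names, and the choice of Jaccard as the similarity measure are hypothetical.

```python
class UnionFind:
    """Groups identifiers into identity classes (assumed sameAs closure)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Unknown identifiers start as their own representative.
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def canonical_extension(instances, uf):
    """Replace each instance by the representative of its identity class."""
    return {uf.find(i) for i in instances}


# Hypothetical sameAs assertions linking instances of two data sets.
same_as = [("a:paris", "b:paris"), ("a:berlin", "b:berlin")]

# Hypothetical extensions of two concepts, one per data set.
ext_a = {"a:paris", "a:berlin", "a:rome"}
ext_b = {"b:paris", "b:berlin", "b:london"}

uf = UnionFind()
for left, right in same_as:
    uf.union(left, right)

plain = jaccard(ext_a, ext_b)
merged = jaccard(canonical_extension(ext_a, uf), canonical_extension(ext_b, uf))

print(f"without sameAs: {plain:.2f}")
print(f"with sameAs:    {merged:.2f}")
```

In this toy setting the two extensions share no identifier, so the overlap is only visible once the identity links are taken into account. Erroneous sameAs links would inflate the overlap in the same way, which is the quality concern the abstract alludes to.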
Original language: English
Title of host publication: K-CAP '19 Proceedings of the 10th International Conference on Knowledge Capture
Publisher: ACM
Pages: 77-84
Number of pages: 8
ISBN (Print): 9781450370080
DOIs: https://doi.org/10.1145/3360901.3364442
Publication status: Published - 19 Nov 2019

Fingerprint

Semantics
Knowledge representation
Experiments

Cite this

Raad, J., Acar, E., & Schlobach, S. (2019). On the Impact of sameAs on Schema Matching. In K-CAP '19 Proceedings of the 10th International Conference on Knowledge Capture (pp. 77-84). ACM. https://doi.org/10.1145/3360901.3364442