Multi-domain and Explainable Prediction of Changes in Web Vocabularies

Albert Meroño-Peñuela, Romana Pernisch, Christophe Guéret, Stefan Schlobach

Research output: Chapter in Book / Report / Conference proceeding, Conference contribution, Academic, peer-review

Abstract

Web vocabularies (WVs) have become a fundamental tool for structuring Web data: over 10 million sites use structured data formats and ontologies to mark up content. Maintaining these vocabularies and keeping up with their changes are manual tasks with very limited automated support, which affects both publishers and users. Existing work shows that machine learning can reliably predict vocabulary changes, but only in specific domains (e.g. biomedicine) and with limited explanation of the impact of changes (e.g. their type, frequency, etc.). In this paper, we describe a framework that uses various supervised learning models to learn and predict changes in versioned vocabularies, independent of their domain. Using well-established results in ontology evolution, we extract domain-agnostic and human-interpretable features and explain their influence on change predictability. Applying our method to 139 WVs from 9 different domains, we find that ontology structural and instance data, the number of versions, and the release frequency correlate highly with the predictability of change. These results can pave the way towards integrating predictive models into knowledge engineering practices and methods.
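To make the idea of learning from versioned vocabularies concrete, the following is a minimal sketch of how training examples might be assembled from consecutive versions. It is not the authors' implementation: the `Version` representation, the two structural features, and the function names `structural_features` and `build_examples` are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation (an assumption, not the paper's data model):
# each version maps a class name to the set of its direct subclasses.
Version = Dict[str, Set[str]]


def structural_features(version: Version, cls: str) -> Tuple[int, int]:
    """Return (number of direct children, number of direct parents) of cls."""
    children = len(version.get(cls, set()))
    parents = sum(1 for subs in version.values() if cls in subs)
    return children, parents


def build_examples(versions: List[Version]) -> List[Tuple[str, Tuple[int, int], bool]]:
    """Pair the features of each class in version i with a label from version i+1.

    The label is True when the class's direct subclasses differ in the next
    version, or when the class is missing from it.
    """
    examples = []
    for current, following in zip(versions, versions[1:]):
        for cls in sorted(current):
            changed = following.get(cls) != current[cls]
            examples.append((cls, structural_features(current, cls), changed))
    return examples


# Two toy versions of an invented vocabulary.
v1 = {"Agent": {"Person"}, "Person": set()}
v2 = {"Agent": {"Person", "Organization"}, "Person": set(), "Organization": set()}
examples = build_examples([v1, v2])
```

Each resulting tuple pairs a class with human-readable features and a changed/unchanged label, which is the shape of input a supervised classifier would typically expect.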

Original language: English
Title of host publication: K-CAP 2021
Subtitle of host publication: Proceedings of the 11th Knowledge Capture Conference
Publisher: Association for Computing Machinery, Inc
Pages: 193-200
Number of pages: 8
ISBN (Electronic): 9781450384575
DOIs
Publication status: Published - Dec 2021
Event: 11th ACM International Conference on Knowledge Capture, K-CAP 2021 - Virtual, Online, United States
Duration: 2 Dec 2021 to 3 Dec 2021

Conference

Conference: 11th ACM International Conference on Knowledge Capture, K-CAP 2021
Country/Territory: United States
City: Virtual, Online
Period: 2/12/21 to 3/12/21

Bibliographical note

Funding Information:
This work was partially supported by Elsevier's Discovery Lab, and the Computational Humanities Programme of the Royal Netherlands Academy of Arts and Sciences.

Publisher Copyright:
© 2021 ACM.

Funding

This work was partially supported by Elsevier's Discovery Lab, and the Computational Humanities Programme of the Royal Netherlands Academy of Arts and Sciences.

Funders: Funder number
Elsevier's Discovery Lab
Horizon 2020 Framework Programme: 101004746
Koninklijke Nederlandse Akademie van Wetenschappen

    Keywords

    • change modelling
    • ontology evolution
    • vocabulary change
