kgbench: A Collection of Knowledge Graph Datasets for Evaluating Relational and Multimodal Machine Learning

Peter Bloem*, Xander Wilcke, Lucas van Berkel, Victor de Boer

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceeding; Conference contribution; Academic; peer-review


Abstract

Graph neural networks and other machine learning models offer a promising direction for machine learning on relational and multimodal data. Until now, however, progress in this area has been difficult to gauge. This is primarily due to a limited number of datasets with (a) a high enough number of labeled nodes in the test set for precise measurement of performance, and (b) a rich enough variety of multimodal information to learn from. We introduce a set of new benchmark tasks for node classification on RDF-encoded knowledge graphs. We focus primarily on node classification, since this setting cannot be solved purely by node embedding models. For each dataset, we provide test and validation sets of at least 1000 instances, with some over 10000. Each task can be performed in a purely relational manner, or with multimodal information. All datasets are packaged in a CSV format that is easily consumable in any machine learning environment, together with the original source data in RDF and pre-processing code for full provenance. We provide code for loading the data into numpy and pytorch. We compute performance for several baseline models.

Original language: English
Title of host publication: The Semantic Web
Subtitle of host publication: 18th International Conference, ESWC 2021, Virtual Event, June 6–10, 2021, Proceedings
Editors: Ruben Verborgh, Katja Hose, Heiko Paulheim, Pierre-Antoine Champin, Maria Maleshkova, Oscar Corcho, Petar Ristoski, Mehwish Alam
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 614-630
Number of pages: 17
ISBN (Electronic): 9783030773854
ISBN (Print): 9783030773847
DOIs
Publication status: Published - 2021
Event: 18th European Semantic Web Conference, ESWC 2021 - Virtual, Online
Duration: 6 Jun 2021 – 10 Jun 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12731 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th European Semantic Web Conference, ESWC 2021
City: Virtual, Online
Period: 6/06/21 – 10/06/21

Bibliographical note

Publisher Copyright:
© 2021, Springer Nature Switzerland AG.

Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.

Keywords

  • Knowledge graphs
  • Machine learning
  • Message passing models
  • Multimodal learning
