Inductive WN18RR and FB15k-237

Dataset / Software

Description

This repository contains knowledge graphs based on the WN18RR and FB15k-237 datasets. We generate new training, validation, and test splits for the <em>inductive</em> setting, where some entities are removed from the training set. The splits are used in the experiments described in the paper "Inductive Entity Representations from Text via Link Prediction".

To generate the inductive splits, we remove nodes from the graph subject to two constraints: no remaining node may become isolated, and the number of edges of any relation type must not drop below 100.
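The two constraints above can be sketched as a check on a single candidate node. This is a minimal illustration, not the authors' implementation: the function name `can_remove`, the `(head, relation, tail)` tuple layout, and the `min_edges` parameter are assumptions made for this example. Only the threshold of 100 comes from the description.

```python
from collections import Counter


def can_remove(node, triples, min_edges=100):
    """Return True if dropping `node` keeps both constraints satisfied.

    `triples` is a list of (head, relation, tail) tuples (assumed layout).
    """
    # Edge count per relation and number of incident triples per node.
    rel_counts = Counter(r for _, r, _ in triples)
    degree = Counter()
    for h, _, t in triples:
        for e in {h, t}:  # a self-loop counts once
            degree[e] += 1

    # What would be lost if every triple touching `node` were removed.
    lost_rel = Counter()
    lost_deg = Counter()
    for h, r, t in triples:
        if node in (h, t):
            lost_rel[r] += 1
            for e in {h, t} - {node}:
                lost_deg[e] += 1

    # Constraint 1: no relation type may drop below `min_edges` edges.
    if any(rel_counts[r] - n < min_edges for r, n in lost_rel.items()):
        return False
    # Constraint 2: no other node may be left without any edge.
    if any(degree[e] - n == 0 for e, n in lost_deg.items()):
        return False
    return True
```

A full split procedure would presumably apply such a check repeatedly while selecting nodes to hold out; the selection order and stopping rule are not specified in the description and are left out here.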

The following are statistics for the datasets.


| | WN18RR-ind | FB15k-237-ind |
|---------------------|------------|---------------|
| Relations | 11 | 237 |
| Training entities | 32,755 | 11,633 |
| Training triples | 69,585 | 215,082 |
| Validation entities | 4,094 | 1,454 |
| Validation triples | 11,381 | 42,164 |
| Test entities | 4,094 | 1,454 |
| Test triples | 12,087 | 52,870 |


The splits for each dataset are called ind-train.tsv, ind-dev.tsv, and ind-test.tsv. We also include textual descriptions for each entity, as well as type information.
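As a rough sketch of how the split files might be read, the snippet below loads one file into a list of triples and reports which entities in an evaluation split are absent from training. It assumes each line holds a head entity, a relation, and a tail entity separated by tabs; that column layout is not stated in this description, so verify it against the files. The helper names `load_triples`, `entities`, and `unseen_entities` are hypothetical.

```python
def load_triples(path):
    """Read a tab-separated file into a list of (head, relation, tail) tuples.

    Assumes three tab-separated columns per line; blank lines are skipped.
    """
    triples = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if line:
                triples.append(tuple(line.split("\t")))
    return triples


def entities(triples):
    """Return the set of entities appearing as head or tail."""
    return {e for h, _, t in triples for e in (h, t)}


def unseen_entities(train_triples, eval_triples):
    """Return entities in the evaluation split that never occur in training."""
    return entities(eval_triples) - entities(train_triples)
```

For example, `unseen_entities(load_triples("ind-train.tsv"), load_triples("ind-test.tsv"))` would list the test entities that do not occur in the training split.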

Date made available: 2021
Publisher: Zenodo
