Abstract
Knowledge Graphs have been recognized as a valuable source for background information in many data mining, information retrieval, natural language processing, and knowledge extraction tasks. However, obtaining a suitable feature vector representation from RDF graphs is a challenging task. In this paper, we extend the RDF2Vec approach, which leverages language modeling techniques for unsupervised feature extraction from sequences of entities. We generate sequences by exploiting local information from graph substructures, harvested by graph walks, and learn latent numerical representations of entities in RDF graphs. We extend the way we compute feature vector representations by comparing twelve different edge weighting functions for performing biased walks on the RDF graph, in order to generate higher quality graph embeddings. We evaluate our approach using different machine learning, as well as entity and document modeling benchmark data sets, and show that the naive RDF2Vec approach can be improved by exploiting Biased Graph Walks.
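The walk-generation step described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy triples, the function names, and the choice of an inverse-predicate-frequency weighting are hypothetical and are not taken from the paper's implementation or confirmed as one of its twelve weighting functions.

```python
import random
from collections import Counter, defaultdict

# Toy RDF-style triples (subject, predicate, object); purely illustrative data.
TRIPLES = [
    ("Berlin", "capitalOf", "Germany"),
    ("Berlin", "type", "City"),
    ("Hamburg", "type", "City"),
    ("Hamburg", "locatedIn", "Germany"),
    ("Germany", "type", "Country"),
    ("Germany", "memberOf", "EU"),
]


def build_graph(triples):
    """Map each subject to its list of outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return graph


def inverse_predicate_frequency(triples):
    """Hypothetical edge weighting: rarer predicates receive larger weights."""
    counts = Counter(p for _, p, _ in triples)
    return {p: 1.0 / c for p, c in counts.items()}


def biased_walk(graph, weights, start, depth, rng):
    """Walk up to `depth` hops, sampling each edge proportionally to its weight."""
    walk = [start]
    node = start
    for _ in range(depth):
        edges = graph.get(node)
        if not edges:  # dead end: the node has no outgoing edges
            break
        edge_weights = [weights[pred] for pred, _ in edges]
        pred, obj = rng.choices(edges, weights=edge_weights, k=1)[0]
        walk.extend([pred, obj])
        node = obj
    return walk


def generate_walks(triples, depth=2, walks_per_entity=3, seed=0):
    """Produce token sequences (entity, predicate, entity, ...) for every subject."""
    rng = random.Random(seed)
    graph = build_graph(triples)
    weights = inverse_predicate_frequency(triples)
    return [
        biased_walk(graph, weights, entity, depth, rng)
        for entity in sorted(graph)
        for _ in range(walks_per_entity)
    ]


if __name__ == "__main__":
    for sequence in generate_walks(TRIPLES):
        print(" -> ".join(sequence))
```

Each resulting sequence can then be treated as a sentence and passed to a word2vec-style language model to learn entity vectors. Swapping `inverse_predicate_frequency` for another weighting function changes only the sampling bias, not the rest of the pipeline.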
Original language | English |
---|---|
Title of host publication | Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics, WIMS 2017 |
Publisher | Association for Computing Machinery |
ISBN (Electronic) | 9781450352253 |
Publication status | Published - 19 Jun 2017 |
Externally published | Yes |
Event | 7th International Conference on Web Intelligence, Mining and Semantics, WIMS 2017, Amantea, Italy (19 Jun 2017 → 22 Jun 2017) |
Publication series
Name | ACM International Conference Proceeding Series |
---|---|
Volume | Part F129475 |
Conference
Conference | 7th International Conference on Web Intelligence, Mining and Semantics, WIMS 2017 |
---|---|
Country/Territory | Italy |
City | Amantea |
Period | 19/06/17 → 22/06/17 |
Funding
Acknowledgments. The work presented in this paper has been partially funded by the Junior-professor funding programme of the Ministry of Science, Research and the Arts of the state of Baden-Württemberg (project "Deep semantic models for high-end NLP application"), and by the German Research Foundation (DFG) under grant number PA 2373/1-1 (Mine@LOD). The implementation of our benchmarks was greatly aided by the work done on the Stanford Network Analysis Platform (SNAP).
Keywords
- Data mining
- Graph embeddings
- Linked open data