LOD-A-lot: A single-file enabler for data science

Wouter Beek, Javier D. Fernández, Ruben Verborgh

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Many data scientists make use of Linked Open Data (LOD) as a huge interconnected knowledge base represented in RDF. However, the distributed nature of the information and the lack of a scalable approach to manage and consume such Big Semantic Data make it difficult and expensive to conduct large-scale studies. As a consequence, most scientists restrict their analyses to one or two datasets (often DBpedia) that contain at most hundreds of millions of triples. LOD-A-lot is a dataset that integrates a large portion (over 28 billion triples) of the LOD Cloud into a single ready-to-consume file that can be easily downloaded, shared and queried with a small memory footprint. This paper shows that a wide collection of Data Science use cases can be performed over such a LOD-A-lot file. For these use cases, LOD-A-lot significantly reduces the cost and complexity of conducting Data Science.

Original language: English
Title of host publication: Proceedings of the 13th International Conference on Semantic Systems, SEMANTiCS 2017
Publisher: Association for Computing Machinery
Pages: 181-184
Number of pages: 4
Volume: 2017-September
ISBN (Electronic): 9781450352963
Publication status: Published - 11 Sept 2017
Event: 13th International Conference on Semantic Systems, SEMANTiCS 2017 - Amsterdam, Netherlands
Duration: 12 Sept 2017 to 13 Sept 2017

Conference

Conference: 13th International Conference on Semantic Systems, SEMANTiCS 2017
Country/Territory: Netherlands
City: Amsterdam
Period: 12/09/17 to 13/09/17
