More complete resultset retrieval from large heterogeneous RDF sources

André Valdestilhas, Tommaso Soru, Muhammad Saleem

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Over the last years, the Web of Data has grown significantly. Various interfaces such as LOD Stats, LOD Laundromat, and SPARQL endpoints provide access to hundreds of thousands of RDF datasets, representing billions of facts. These datasets are available in different formats, such as raw data dumps and HDT files, or are directly accessible via SPARQL endpoints. Querying such a large amount of distributed data is particularly challenging, and many of these datasets cannot be directly queried using the SPARQL query language. To tackle these problems, we present WimuQ, an integrated query engine that executes SPARQL queries and retrieves results from a large number of heterogeneous RDF data sources. Presently, WimuQ is able to execute both federated and non-federated SPARQL queries over a total of 668,166 datasets from LOD Stats and LOD Laundromat as well as 559 active SPARQL endpoints. These data sources represent a total of 221.7 billion triples from more than 5 terabytes of information, drawn from datasets retrieved using the service "Where is My URI" (WIMU). Our evaluation on state-of-the-art real-data benchmarks shows that WimuQ retrieves more complete results for the benchmark queries.
Original language: English
Title of host publication: K-CAP 2019 - Proceedings of the 10th International Conference on Knowledge Capture
Publisher: Association for Computing Machinery, Inc
Pages: 223-230
ISBN (Electronic): 9781450370080
Publication status: Published - 23 Sept 2019
Externally published: Yes
Event: 10th International Conference on Knowledge Capture, K-CAP 2019 - Marina Del Rey, United States
Duration: 19 Nov 2019 → 21 Nov 2019

Conference

Conference: 10th International Conference on Knowledge Capture, K-CAP 2019
Country/Territory: United States
City: Marina Del Rey
Period: 19/11/19 → 21/11/19

Funding

This work has been supported by the projects LIMBO (grant no. 19F2029I), OPAL (no. 19F2028A), KnowGraphs (no. 860801), and SOLIDE (no. 13N14456), by CNPq Brazil under grant no. 201536/2014-5, and by Deutsche Forschungsgemeinschaft (DFG), project number 317044652. Special thanks to Thomas Riechert.

Funders and funder numbers
OPAL: 19F2028A, 860801
SOLIDE: 13N14456
Thomas Riechert
Deutsche Forschungsgemeinschaft: 317044652
Conselho Nacional de Desenvolvimento Científico e Tecnológico: 201536/2014-5
