TY - GEN
T1 - Modelling and querying lists in RDF
T2 - 3rd Workshop on Querying and Benchmarking the Web of Data, QuWeDa 2019
AU - Daga, Enrico
AU - Meroño Peñuela, Albert
AU - Motta, Enrico
PY - 2019
Y1 - 2019
N2 - Many Linked Data datasets model elements in their domains in the form of lists: a countable number of ordered resources. When publishing these lists in RDF, an important concern is making them easy to consume. Therefore, a well-known recommendation is to find an existing list modelling solution, and reuse it. However, a specific domain model can be implemented in different ways and vocabularies may provide alternative solutions. In this paper, we argue that a wrong decision could have a significant impact in terms of performance and, ultimately, the availability of the data. We take the case of RDF Lists and make the hypothesis that the efficiency of retrieving sequential linked data depends primarily on how they are modelled (triple-store invariance hypothesis). To demonstrate this, we survey different solutions for modelling sequences in RDF, and propose a pragmatic approach for assessing their impact on data availability. Finally, we derive good (and bad) practices on how to publish lists as linked open data. By doing this, we sketch the foundations of an empirical, task-oriented methodology for benchmarking linked data modelling solutions.
KW - Benchmarking methodology
KW - Linked Open Data
KW - RDF Lists
KW - SPARQL Benchmark
UR - http://www.scopus.com/inward/record.url?scp=85075300126&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075300126&partnerID=8YFLogxK
UR - https://ceur-ws.org/Vol-2496
M3 - Conference contribution
AN - SCOPUS:85075300126
T3 - CEUR Workshop Proceedings
SP - 21
EP - 36
BT - QuWeDa 2019 3rd Workshop on Querying and Benchmarking the Web of Data
A2 - Saleem, Muhammad
A2 - Hogan, Aidan
A2 - Usbeck, Ricardo
A2 - Ngonga Ngomo, Axel-Cyrille
A2 - Verborgh, Ruben
PB - CEUR-WS
Y2 - 26 October 2019 through 30 October 2019
ER -