TY - JOUR
T1 - WebPIE: A Web-scale parallel inference engine using MapReduce
T2 - A Web-scale Parallel Inference Engine using MapReduce
AU - Urbani, J.
AU - Kotoulas, S.
AU - Maassen, J.
AU - van Harmelen, F.A.H.
AU - Bal, H.E.
PY - 2012/1/1
Y1 - 2012/1/1
AB - The large amount of Semantic Web data and its fast growth pose a significant computational challenge in performing efficient and scalable reasoning. On a large scale, the resources of single machines are no longer sufficient and we are required to distribute the process to improve performance. In this article, we propose a distributed technique to perform materialization under the RDFS and OWL ter Horst semantics using the MapReduce programming model. We will show that a straightforward implementation is not efficient and does not scale. Our technique addresses the challenge of distributed reasoning through a set of algorithms which, combined, significantly increase performance. We have implemented WebPIE (Web-scale Inference Engine) and we demonstrate its performance on a cluster of up to 64 nodes. We have evaluated our system using very large real-world datasets (Bio2RDF, LLD, LDSR) and the LUBM synthetic benchmark, scaling up to 100 billion triples. Results show that our implementation scales linearly and vastly outperforms current systems in terms of maximum data size and inference speed.
KW - Distributed computing
KW - High performance
KW - MapReduce
KW - Reasoning
KW - Semantic Web
UR - http://www.scopus.com/inward/record.url?scp=84857059852&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84857059852&partnerID=8YFLogxK
U2 - 10.1016/j.websem.2011.05.004
DO - 10.1016/j.websem.2011.05.004
M3 - Article
SN - 1570-8268
VL - 10
SP - 59
EP - 75
JO - Journal of Web Semantics
JF - Journal of Web Semantics
ER -