MARVIN: Distributed reasoning over large-scale Semantic Web data

E. Oren, S. Kotoulas, G. Anadiotis, R.M. Siebes, A.C.M. ten Teije, F.A.H. van Harmelen

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

Many Semantic Web problems are difficult to solve through common divide-and-conquer strategies, since they are hard to partition. We present Marvin, a parallel and distributed platform for processing large amounts of RDF data, on a network of loosely coupled peers. We present our divide-conquer-swap strategy and show that this model converges towards completeness. Within this strategy, we address the problem of making distributed reasoning scalable and load-balanced. We present SpeedDate, a routing strategy that combines data clustering with random exchanges. The random exchanges ensure load balancing, while the data clustering attempts to maximise efficiency. SpeedDate is compared against random and deterministic (DHT-like) approaches, on performance and load-balancing. We simulate parameters such as system size, data distribution, churn rate, and network topology. The results indicate that SpeedDate is near-optimally balanced, performs in the same order of magnitude as a DHT-like approach, and has an average throughput per node that scales with sqrt(i) for i items in the system. We evaluate our overall Marvin system for performance, scalability, load balancing and efficiency. © 2009 Elsevier B.V. All rights reserved.
Original language: English
Journal: Journal of Web Semantics
Volume: 7
Issue number: 4
DOI: 10.1016/j.websem.2009.09.002
Publication status: Published - 2009

Fingerprint

Semantic Web
Resource allocation
Scalability
Throughput
Topology
Processing

Cite this

@article{dcf9139489264b0793a7e4646dea2a19,
title = "MARVIN: Distributed reasoning over large-scale Semantic Web data",
abstract = "Many Semantic Web problems are difficult to solve through common divide-and-conquer strategies, since they are hard to partition. We present Marvin, a parallel and distributed platform for processing large amounts of RDF data, on a network of loosely coupled peers. We present our divide-conquer-swap strategy and show that this model converges towards completeness. Within this strategy, we address the problem of making distributed reasoning scalable and load-balanced. We present SpeedDate, a routing strategy that combines data clustering with random exchanges. The random exchanges ensure load balancing, while the data clustering attempts to maximise efficiency. SpeedDate is compared against random and deterministic (DHT-like) approaches, on performance and load-balancing. We simulate parameters such as system size, data distribution, churn rate, and network topology. The results indicate that SpeedDate is near-optimally balanced, performs in the same order of magnitude as a DHT-like approach, and has an average throughput per node that scales with sqrt(i) for i items in the system. We evaluate our overall Marvin system for performance, scalability, load balancing and efficiency. {\textcopyright} 2009 Elsevier B.V. All rights reserved.",
author = "E. Oren and S. Kotoulas and G. Anadiotis and R.M. Siebes and {ten Teije}, A.C.M. and {van Harmelen}, F.A.H.",
year = "2009",
doi = "10.1016/j.websem.2009.09.002",
language = "English",
volume = "7",
journal = "Journal of Web Semantics",
issn = "1570-8268",
publisher = "Elsevier",
number = "4",
}

MARVIN: Distributed reasoning over large-scale Semantic Web data. / Oren, E.; Kotoulas, S.; Anadiotis, G.; Siebes, R.M.; ten Teije, A.C.M.; van Harmelen, F.A.H.

In: Journal of Web Semantics, Vol. 7, No. 4, 2009.

TY - JOUR

T1 - MARVIN: Distributed reasoning over large-scale Semantic Web data

AU - Oren, E.

AU - Kotoulas, S.

AU - Anadiotis, G.

AU - Siebes, R.M.

AU - ten Teije, A.C.M.

AU - van Harmelen, F.A.H.

PY - 2009

Y1 - 2009

N2 - Many Semantic Web problems are difficult to solve through common divide-and-conquer strategies, since they are hard to partition. We present Marvin, a parallel and distributed platform for processing large amounts of RDF data, on a network of loosely coupled peers. We present our divide-conquer-swap strategy and show that this model converges towards completeness. Within this strategy, we address the problem of making distributed reasoning scalable and load-balanced. We present SpeedDate, a routing strategy that combines data clustering with random exchanges. The random exchanges ensure load balancing, while the data clustering attempts to maximise efficiency. SpeedDate is compared against random and deterministic (DHT-like) approaches, on performance and load-balancing. We simulate parameters such as system size, data distribution, churn rate, and network topology. The results indicate that SpeedDate is near-optimally balanced, performs in the same order of magnitude as a DHT-like approach, and has an average throughput per node that scales with sqrt(i) for i items in the system. We evaluate our overall Marvin system for performance, scalability, load balancing and efficiency. © 2009 Elsevier B.V. All rights reserved.

AB - Many Semantic Web problems are difficult to solve through common divide-and-conquer strategies, since they are hard to partition. We present Marvin, a parallel and distributed platform for processing large amounts of RDF data, on a network of loosely coupled peers. We present our divide-conquer-swap strategy and show that this model converges towards completeness. Within this strategy, we address the problem of making distributed reasoning scalable and load-balanced. We present SpeedDate, a routing strategy that combines data clustering with random exchanges. The random exchanges ensure load balancing, while the data clustering attempts to maximise efficiency. SpeedDate is compared against random and deterministic (DHT-like) approaches, on performance and load-balancing. We simulate parameters such as system size, data distribution, churn rate, and network topology. The results indicate that SpeedDate is near-optimally balanced, performs in the same order of magnitude as a DHT-like approach, and has an average throughput per node that scales with sqrt(i) for i items in the system. We evaluate our overall Marvin system for performance, scalability, load balancing and efficiency. © 2009 Elsevier B.V. All rights reserved.

U2 - 10.1016/j.websem.2009.09.002

DO - 10.1016/j.websem.2009.09.002

M3 - Article

VL - 7

JO - Journal of Web Semantics

JF - Journal of Web Semantics

SN - 1570-8268

IS - 4

ER -