TY - GEN
T1 - Fast dual simulation processing of graph database queries
AU - Mennicke, S.
AU - Kalo, J.-C.
AU - Nagel, D.
AU - Kroll, H.
AU - Balke, W.-T.
PY - 2019/4/1
Y1 - 2019/4/1
N2 - © 2019 IEEE.Graph database query languages feature expressive yet computationally expensive pattern matching capabilities. Answering optional query clauses in SPARQL for instance renders the query evaluation problem immediately PSPACE-complete. Light-weight graph pattern matching relations, such as simulation, have recently been investigated as promising alternatives to more expensive query mechanisms like, e.g., computing subgraph isomorphism. Still, pattern matching alone lacks expressive query capabilities: graph patterns may be combined by usual inner joins. However, including more sophisticated operators is inevitable to make solutions more useful for emerging applications. In this paper we bridge this gap by introducing a new dual simulation process operating on SPARQL queries. In addition to supporting the full syntactic structure of SPARQL queries, it features polynomial-time pattern matching to compute an overapproximation of the query results. Moreover, to achieve running times competing with state-of-the-art database systems, we develop a novel algorithmic solution to dual simulation graph pattern matching, based on a system of inequalities that allows for several optimization heuristics. Finally, we achieve soundness of our process for SPARQL queries including UNION, AND and OPTIONAL operators not restricted to well-designed patterns. Our experiments on synthetic and real-world graph data promise a clear gain for graph database systems when incorporating the new dual simulation techniques.
AB - © 2019 IEEE.Graph database query languages feature expressive yet computationally expensive pattern matching capabilities. Answering optional query clauses in SPARQL for instance renders the query evaluation problem immediately PSPACE-complete. Light-weight graph pattern matching relations, such as simulation, have recently been investigated as promising alternatives to more expensive query mechanisms like, e.g., computing subgraph isomorphism. Still, pattern matching alone lacks expressive query capabilities: graph patterns may be combined by usual inner joins. However, including more sophisticated operators is inevitable to make solutions more useful for emerging applications. In this paper we bridge this gap by introducing a new dual simulation process operating on SPARQL queries. In addition to supporting the full syntactic structure of SPARQL queries, it features polynomial-time pattern matching to compute an overapproximation of the query results. Moreover, to achieve running times competing with state-of-the-art database systems, we develop a novel algorithmic solution to dual simulation graph pattern matching, based on a system of inequalities that allows for several optimization heuristics. Finally, we achieve soundness of our process for SPARQL queries including UNION, AND and OPTIONAL operators not restricted to well-designed patterns. Our experiments on synthetic and real-world graph data promise a clear gain for graph database systems when incorporating the new dual simulation techniques.
U2 - 10.1109/ICDE.2019.00030
DO - 10.1109/ICDE.2019.00030
M3 - Conference contribution
T3 - Proceedings - International Conference on Data Engineering
SP - 244
EP - 255
BT - Proceedings - 2019 IEEE 35th International Conference on Data Engineering, ICDE 2019
PB - IEEE Computer Society
T2 - 35th IEEE International Conference on Data Engineering, ICDE 2019
Y2 - 8 April 2019 through 11 April 2019
ER -