TY - GEN
T1 - Order-preserving incomplete suffix trees and order-preserving indexes
AU - Crochemore, Maxime
AU - Iliopoulos, Costas S.
AU - Kociumaka, Tomasz
AU - Kubica, Marcin
AU - Langiu, Alessio
AU - Pissis, Solon P.
AU - Radoszewski, Jakub
AU - Rytter, Wojciech
AU - Waleń, Tomasz
PY - 2013/1/1
Y1 - 2013/1/1
AB - Recently Kubica et al. (Inf. Process. Lett., 2013) and Kim et al. (submitted to Theor. Comput. Sci.) introduced order-preserving pattern matching: for a given text, the goal is to find its factors having the same 'shape' as a given pattern. Known results include a linear-time algorithm for this problem (in the case of a polynomially bounded alphabet) and a generalization to multiple patterns. We give an O(n log log n)-time construction of an index that enables order-preserving pattern matching queries in time proportional to the pattern length. The main component is a data structure that is an incomplete suffix tree in the order-preserving setting. The tree can miss single letters related to branching at internal nodes. Such incompleteness results from the weakness of our so-called weak character oracle. However, due to its weakness, such an oracle can answer queries on-line in O(log log n) time using a sliding-window approach. For most applications, such incomplete suffix trees provide the same functional power as the complete ones. We also give an O(n log n/log log n)-time algorithm constructing complete order-preserving suffix trees.
UR - http://www.scopus.com/inward/record.url?scp=84893923818&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893923818&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-02432-5_13
DO - 10.1007/978-3-319-02432-5_13
M3 - Conference contribution
AN - SCOPUS:84893923818
SN - 9783319024318
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 84
EP - 95
BT - String Processing and Information Retrieval - 20th International Symposium, SPIRE 2013, Proceedings
PB - Springer Verlag
T2 - 20th International Symposium on String Processing and Information Retrieval, SPIRE 2013
Y2 - 7 October 2013 through 9 October 2013
ER -