TY - GEN
T1 - Property suffix array with applications
AU - Charalampopoulos, Panagiotis
AU - Iliopoulos, Costas S.
AU - Liu, Chang
AU - Pissis, Solon P.
PY - 2018/1/1
Y1 - 2018/1/1
N2 - The suffix array is one of the most prevalent data structures for string indexing; it stores the lexicographically sorted list of suffixes of a given string. Its practical advantage compared to the suffix tree is space efficiency. In Property Indexing, we are given a string x of length n and a property Π, i.e. a set of Π -valid intervals over x. A suffix-tree-like index over these valid prefixes of suffixes of x can be built in time and space O(n). We show here how to directly build a suffix-array-like index, the Property Suffix Array (PSA), in time and space O(n). We mainly draw our motivation from weighted (probabilistic) sequences: sequences of probability distributions over a given alphabet. Given a probability threshold 1z, we say that a string p of length m matches a weighted sequence X of length n at starting position i if the product of probabilities of the letters of p at positions i, …, i+ m- 1 in X is at least 1z. Our algorithm for building the PSA can be directly applied to build an O(nz) -sized suffix-array-like index over X in time and space O(nz).
AB - The suffix array is one of the most prevalent data structures for string indexing; it stores the lexicographically sorted list of suffixes of a given string. Its practical advantage compared to the suffix tree is space efficiency. In Property Indexing, we are given a string x of length n and a property Π, i.e. a set of Π -valid intervals over x. A suffix-tree-like index over these valid prefixes of suffixes of x can be built in time and space O(n). We show here how to directly build a suffix-array-like index, the Property Suffix Array (PSA), in time and space O(n). We mainly draw our motivation from weighted (probabilistic) sequences: sequences of probability distributions over a given alphabet. Given a probability threshold 1z, we say that a string p of length m matches a weighted sequence X of length n at starting position i if the product of probabilities of the letters of p at positions i, …, i+ m- 1 in X is at least 1z. Our algorithm for building the PSA can be directly applied to build an O(nz) -sized suffix-array-like index over X in time and space O(nz).
UR - http://www.scopus.com/inward/record.url?scp=85045398695&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85045398695&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-77404-6_22
DO - 10.1007/978-3-319-77404-6_22
M3 - Conference contribution
AN - SCOPUS:85045398695
SN - 9783319774039
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 290
EP - 302
BT - LATIN 2018
A2 - Mosteiro, Miguel A.
A2 - Bender, Michael A.
A2 - Farach-Colton, Martin
PB - Springer Verlag
T2 - 13th International Symposium on Latin American Theoretical Informatics, LATIN 2018
Y2 - 16 April 2018 through 19 April 2018
ER -