TY - GEN
T1 - Linear-time computation of prefix table for weighted strings
AU - Barton, Carl
AU - Pissis, Solon P.
PY - 2015/1/1
Y1 - 2015/1/1
N2 - The prefix table of a string is one of the most fundamental data structures of algorithms on strings: it determines the longest factor at each position of the string that matches a prefix of the string. It can be computed in time linear with respect to the size of the string, and hence it can be used efficiently for locating patterns or for regularity searching in strings. A weighted string is a string in which a set of letters may occur at each position with respective occurrence probabilities. Weighted strings, also known as position weight matrices, naturally arise in many biological contexts; for example, they provide a method to realise approximation among occurrences of the same DNA segment. In this article, given a weighted string x of length n and a constant cumulative weight threshold 1/z, defined as the minimal probability of occurrence of factors in x, we present an ⱷ(n)-time algorithm for computing the prefix table of x.
AB - The prefix table of a string is one of the most fundamental data structures of algorithms on strings: it determines the longest factor at each position of the string that matches a prefix of the string. It can be computed in time linear with respect to the size of the string, and hence it can be used efficiently for locating patterns or for regularity searching in strings. A weighted string is a string in which a set of letters may occur at each position with respective occurrence probabilities. Weighted strings, also known as position weight matrices, naturally arise in many biological contexts; for example, they provide a method to realise approximation among occurrences of the same DNA segment. In this article, given a weighted string x of length n and a constant cumulative weight threshold 1/z, defined as the minimal probability of occurrence of factors in x, we present an ⱷ(n)-time algorithm for computing the prefix table of x.
UR - http://www.scopus.com/inward/record.url?scp=84945955307&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84945955307&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-23660-5_7
DO - 10.1007/978-3-319-23660-5_7
M3 - Conference contribution
AN - SCOPUS:84945955307
SN - 9783319236599
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 73
EP - 84
BT - Combinatorics on Words - 10th International Conference, WORDS 2015, Proceedings
A2 - Nowotka, Dirk
A2 - Manea, Florin
PB - Springer Verlag
T2 - 10th International Conference on Words, WORDS 2015
Y2 - 14 September 2015 through 17 September 2015
ER -