TY - GEN

T1 - Fast and simple computations using prefix tables under hamming and edit distance

AU - Barton, Carl

AU - Iliopoulos, Costas S.

AU - Pissis, Solon P.

AU - Smyth, William F.

PY - 2015/1/1

Y1 - 2015/1/1

N2 - In this article, we introduce a new and simple data structure, the prefix table under Hamming distance, and present two algorithms to compute it efficiently: one asymptotically fast; the other very fast on average and in practice. Because the latter approach avoids the computation of global data structures, such as the suffix array and the longest common prefix array, it yields algorithms much faster in practice than existing methods. We show how this data structure can be used to solve two string problems of interest: (a) approximate string matching under Hamming distance; and (b) longest approximate overlap under Hamming distance. Analogously, we introduce the prefix table under edit distance, and present an efficient algorithm for its computation. In the process, we also define the border array under both distance measures, and provide an algorithm for conversion between prefix tables and border arrays.

AB - In this article, we introduce a new and simple data structure, the prefix table under Hamming distance, and present two algorithms to compute it efficiently: one asymptotically fast; the other very fast on average and in practice. Because the latter approach avoids the computation of global data structures, such as the suffix array and the longest common prefix array, it yields algorithms much faster in practice than existing methods. We show how this data structure can be used to solve two string problems of interest: (a) approximate string matching under Hamming distance; and (b) longest approximate overlap under Hamming distance. Analogously, we introduce the prefix table under edit distance, and present an efficient algorithm for its computation. In the process, we also define the border array under both distance measures, and provide an algorithm for conversion between prefix tables and border arrays.

UR - http://www.scopus.com/inward/record.url?scp=84937414924&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84937414924&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-19315-1_5

DO - 10.1007/978-3-319-19315-1_5

M3 - Conference contribution

AN - SCOPUS:84937414924

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 49

EP - 61

BT - Combinatorial Algorithms - 25th International Workshop, IWOCA 2014, Revised Selected Papers

A2 - Froncek, Dalibor

A2 - Kratochvíl, Jan

A2 - Miller, Mirka

PB - Springer Verlag

T2 - 25th International Workshop on Combinatorial Algorithms, IWOCA 2014

Y2 - 15 October 2014 through 17 October 2014

ER -