TY - JOUR
T1 - Querying highly similar sequences
AU - Barton, Carl
AU - Giraud, Mathieu
AU - Iliopoulos, Costas S.
AU - Lecroq, Thierry
AU - Mouchard, Laurent
AU - Pissis, Solon P.
PY - 2013/1/1
Y1 - 2013/1/1
N2 - In this paper, we present a solution to the extreme similarity sequencing problem. The extreme similarity sequencing problem consists of finding occurrences of a pattern p in a set S0, S1, ..., Sk of sequences of equal length, where Si, for all 1 ≤ i ≤ k, differs from S0 by a constant number of errors (around 10 in practice). We present an asymptotically fast O(n + occ log occ) time algorithm, as well as a practical O(nk/w) time algorithm for solving this problem, where n is the length of a sequence, occ is the number of candidate occurrences reported by our technique, w is the size of the machine word, and the total number of errors is bounded by k, the number of sequences.
AB - In this paper, we present a solution to the extreme similarity sequencing problem. The extreme similarity sequencing problem consists of finding occurrences of a pattern p in a set S0, S1, ..., Sk of sequences of equal length, where Si, for all 1 ≤ i ≤ k, differs from S0 by a constant number of errors (around 10 in practice). We present an asymptotically fast O(n + occ log occ) time algorithm, as well as a practical O(nk/w) time algorithm for solving this problem, where n is the length of a sequence, occ is the number of candidate occurrences reported by our technique, w is the size of the machine word, and the total number of errors is bounded by k, the number of sequences.
KW - DNA sequencing
KW - Highly similar sequences
KW - Next-generation sequencing
KW - NGS
KW - Querying DNA sequences
KW - Similarity searching
UR - http://www.scopus.com/inward/record.url?scp=84874443266&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84874443266&partnerID=8YFLogxK
M3 - Article
C2 - 23428478
AN - SCOPUS:84874443266
SN - 1756-0756
VL - 6
SP - 119
EP - 130
JO - International Journal of Computational Biology and Drug Design
JF - International Journal of Computational Biology and Drug Design
IS - 1-2
ER -