TY - GEN
T1 - Circular sequence comparison with q-grams
AU - Grossi, Roberto
AU - Iliopoulos, Costas S.
AU - Mercaş, Robert
AU - Pisanti, Nadia
AU - Pissis, Solon P.
AU - Retha, Ahmad
AU - Vayani, Fatima
PY - 2015/1/1
Y1 - 2015/1/1
N2 - Sequence comparison is a fundamental step in many important tasks in bioinformatics. Traditional algorithms for measuring approximation in sequence comparison are based on the notions of distance or similarity, and are generally computed through sequence alignment techniques. As circular genome structure is a common phenomenon in nature, a caveat of specialized alignment techniques for circular sequence comparison is that they are computationally expensive, requiring from superquadratic to cubic time in the length of the sequences. In this paper, we introduce a new distance measure based on q-grams, and show how it can be computed efficiently for circular sequence comparison. Experimental results, using real and synthetic data, demonstrate orders-of-magnitude superiority of our approach in terms of efficiency, while maintaining an accuracy very competitive to the state of the art.
AB - Sequence comparison is a fundamental step in many important tasks in bioinformatics. Traditional algorithms for measuring approximation in sequence comparison are based on the notions of distance or similarity, and are generally computed through sequence alignment techniques. As circular genome structure is a common phenomenon in nature, a caveat of specialized alignment techniques for circular sequence comparison is that they are computationally expensive, requiring from superquadratic to cubic time in the length of the sequences. In this paper, we introduce a new distance measure based on q-grams, and show how it can be computed efficiently for circular sequence comparison. Experimental results, using real and synthetic data, demonstrate orders-of-magnitude superiority of our approach in terms of efficiency, while maintaining an accuracy very competitive to the state of the art.
UR - http://www.scopus.com/inward/record.url?scp=84947763685&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84947763685&partnerID=8YFLogxK
U2 - 10.1007/978-3-662-48221-6_15
DO - 10.1007/978-3-662-48221-6_15
M3 - Conference contribution
AN - SCOPUS:84947763685
SN - 9783662482209
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 203
EP - 216
BT - Algorithms in Bioinformatics - 15th International Workshop, WABI 2015, Proceedings
A2 - Pop, Mihai
A2 - Touzet, Hélène
PB - Springer Verlag
T2 - 15th International Workshop on Algorithms in Bioinformatics, WABI 2015
Y2 - 10 September 2015 through 12 September 2015
ER -