Generalised Implementation for Fixed-Length Approximate String Matching under Hamming Distance & Applications

Solon Pissis, Ahmad Retha

Research output: Chapter in Book / Report / Conference proceedingConference contributionAcademicpeer-review

Abstract

Approximate string matching is the problem of finding all factors of a text t of length n with a distance at most k from a pattern x of length m ≤ n. Fixed-length approximate string matching is the problem of finding all factors of a text t of length n with a distance at most k from any factor of length h of a pattern x of length m ≤ n, where h ≤ m. It is thus a generalisation of approximate string matching and has numerous direct applications in molecular biology - motif extraction and circular sequence alignment, to name a few. MaxShiftM is a bit-parallel algorithm for fixed-length approximate string matching under the Hamming distance model with time complexity O (m[h/w]n), where w is the size of the computer word (Crochemore et al., 2010). An implementation of this algorithm is straightforward as long as the maximal length of alignments is less than or equal to w. In this article, our contribution is twofold: first, we propose a generalised implementation of MaxShiftM, that is, for any given h ≤ m, under the Hamming distance model, and second, we show how our implementation can be used to improve the accuracy and efficiency of multiple circular sequence alignment in terms of the inferred likelihood-based phylogenies.

Original languageEnglish
Title of host publicationProceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2015
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages367-374
Number of pages8
ISBN (Electronic)0769555101, 9780769555102
DOIs
Publication statusPublished - 29 Sep 2015
Externally publishedYes
Event29th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2015 - Hyderabad, India
Duration: 25 May 201529 May 2015

Publication series

NameProceedings - 2015 IEEE 29th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2015

Conference

Conference29th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2015
CountryIndia
CityHyderabad
Period25/05/1529/05/15

Keywords

  • algorithms on strings
  • approximate string matching
  • dynamic programming

Fingerprint

Dive into the research topics of 'Generalised Implementation for Fixed-Length Approximate String Matching under Hamming Distance & Applications'. Together they form a unique fingerprint.

Cite this