Faster algorithms for 1-mappability of a sequence

Mai Alzamel*, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis, Jakub Radoszewski, Wing Kin Sung

*Corresponding author for this work

Research output: Contribution to Journal › Article › Academic › peer-review

Abstract

In the k-mappability problem, we are given a string x of length n and integers m and k, and we are asked to count, for each length-m factor y of x, the number of other factors of length m of x that are at Hamming distance at most k from y. We focus here on the version of the problem where k = 1. There exists an algorithm to solve this problem for k = 1 requiring time O(mn log n / log log n) using space O(n). Here we present two new algorithms that require worst-case time O(mn) and O(n log n log log n), respectively, and space O(n), thus greatly improving the previous result. Moreover, we present another algorithm that requires average-case time and space O(n) for integer alphabets of size σ if m = Ω(log_σ n). Notably, we show that this algorithm is generalizable for arbitrary k, requiring average-case time O(kn) and space O(n) if m = Ω(k log_σ n), assuming that the letters are independent and uniformly distributed random variables. Finally, we provide an experimental evaluation of our average-case algorithm demonstrating its competitiveness to the state-of-the-art implementation.
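To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not one of the algorithms from the paper: it simply compares every pair of length-m factors and therefore takes O(n²m) time. The function names `hamming_at_most` and `naive_mappability` are hypothetical, and the sketch assumes that "other factors" means factors starting at other positions of x, so an exact repeat elsewhere in x is counted.

```python
def hamming_at_most(a: str, b: str, k: int) -> bool:
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            mismatches += 1
            if mismatches > k:
                return False  # stop early once the budget k is exceeded
    return True


def naive_mappability(x: str, m: int, k: int = 1) -> list[int]:
    """Brute-force k-mappability (illustrative only, O(n^2 * m) time).

    counts[i] is the number of start positions j != i such that the
    length-m factors of x starting at i and j are within Hamming distance k.
    Assumption: factors are counted by position, so exact repeats count.
    """
    n = len(x)
    if m <= 0 or m > n:
        return []
    factors = [x[i:i + m] for i in range(n - m + 1)]
    counts = [0] * len(factors)
    for i in range(len(factors)):
        for j in range(i + 1, len(factors)):
            # Hamming distance is symmetric, so each matching pair
            # contributes to both positions.
            if hamming_at_most(factors[i], factors[j], k):
                counts[i] += 1
                counts[j] += 1
    return counts


if __name__ == "__main__":
    # Factors of "abab" with m = 2 are "ab", "ba", "ab".
    print(naive_mappability("abab", 2, 1))  # prints [1, 0, 1]
```

A quadratic baseline like this is useful mainly as a correctness check on small inputs; the bounds quoted in the abstract concern substantially faster methods.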

Original language: English
Pages (from-to): 2-12
Number of pages: 11
Journal: Theoretical Computer Science
Volume: 812
Publication status: Published - 6 Apr 2020
Externally published: Yes

Keywords

  • Algorithms on strings
  • Hamming distance
  • Sequence mappability
