Next-generation sequencing technologies have redefined the way genome sequencing is performed. They are able to produce tens of millions of short sequences (reads), during a single experiment, and with a much lower cost than previously possible. Due to the dramatic increase in the amount of data generated, a challenging task is to map (align) a set of reads to a reference genome. In this paper, we study a different version of this problem: mapping these reads to a dynamically changing genomic sequence. We propose a new practical algorithm, which employs a suitable data structure that takes into account potential dynamic effects (replacements, insertions, deletions) on the genomic sequence. The presented experimental results demonstrate that the proposed approach can be extended and applied to address the problem of mapping short reads to multiple related genomes.
- Next-generation sequencing
- String algorithms