On-line pattern matching on uncertain sequences and applications

Carl Barton, Chang Liu, Solon P. Pissis*

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceedingConference contributionAcademicpeer-review

Abstract

We study the fundamental problem of pattern matching in the case where the string data is weighted: for every position of the string and every letter of the alphabet a probability of occurrence for this letter at this position is given. Sequences of this type are commonly used to represent uncertain data. They are of particular interest in computational molecular biology as they can represent different kind of ambiguities in DNA sequences: distributions of SNPs in genomes populations; position frequency matrices of DNA binding profiles; or even sequencingrelated uncertainties. A weighted string may thus represent many different strings, each with probability of occurrence equal to the product of probabilities of its letters at subsequent positions. In this article, we present new average-case results on pattern matching on weighted strings and show how they are applied effectively in several biological contexts. A free open-source implementation of our algorithms is made available.

Original languageEnglish
Title of host publicationCombinatorial Optimization and Applications - 10th International Conference, COCOA 2016, Proceedings
EditorsMinming Li, Lusheng Wang, T-H. Hubert Chan
PublisherSpringer Verlag
Pages547-562
Number of pages16
ISBN (Print)9783319487489
DOIs
Publication statusPublished - 1 Jan 2016
Externally publishedYes
Event10th Annual International Conference on Combinatorial Optimization and Applications, COCOA 2016 - Hong Kong, China
Duration: 16 Dec 201618 Dec 2016

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume10043 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference10th Annual International Conference on Combinatorial Optimization and Applications, COCOA 2016
CountryChina
CityHong Kong
Period16/12/1618/12/16

Fingerprint

Dive into the research topics of 'On-line pattern matching on uncertain sequences and applications'. Together they form a unique fingerprint.

Cite this