On-line pattern matching on similar texts

Roberto Grossi, Costas S. Iliopoulos, Chang Liu, Nadia Pisanti, Solon P. Pissis, Ahmad Retha, Giovanna Rosone, Fatima Vayani, Luca Versari

Research output: Chapter in Book / Report / Conference proceedingConference contributionAcademicpeer-review

Abstract

Pattern matching on a set of similar texts has received much attention, especially recently, mainly due to its application in cataloguing human genetic variation. In particular, many different algorithms have been proposed for the off-line version of this problem; that is, constructing a compressed index for a set of similar texts in order to answer pattern matching queries efficiently. However, the on-line, more fundamental, version of this problem is a rather undeveloped topic. Solutions to the on-line version can be beneficial for a number of reasons; for instance, efficient on-line solutions can be used in combination with partial indexes as practical trade-offs. We make here an attempt to close this gap via proposing two efficient algorithms for this problem. Notably, one of the algorithms requires time linear in the size of the texts' representation, for short patterns. Furthermore, experimental results confirm our theoretical findings in practical terms.

Original languageEnglish
Title of host publication28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017
EditorsJakub Radoszewski, Juha Karkkainen, Jakub Radoszewski, Wojciech Rytter
PublisherSchloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
ISBN (Electronic)9783959770392
DOIs
Publication statusPublished - 1 Jul 2017
Externally publishedYes
Event28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017 - Warsaw, Poland
Duration: 4 Jul 20176 Jul 2017

Publication series

NameLeibniz International Proceedings in Informatics, LIPIcs
Volume78
ISSN (Print)1868-8969

Conference

Conference28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017
Country/TerritoryPoland
CityWarsaw
Period4/07/176/07/17

Funding

Roberto Grossi, Nadia Pisanti, and Giovanna Rosone are partially supported by the project UniPi PRA-2017-44 ("Advanced computational methodologies for the analysis of biomedical data"). Costas S. Iliopoulos is partially supported by the Onassis Foundation. Ahmad Retha is supported by the Graduate Teaching Scholarship scheme of the Department of Informatics at King's College London. Giovanna Rosone is partially supported by the project MIUR-SIR CMACBioSeq ("Combinatorial methods for analysis and compression of biological sequences") grant n. RBSI146R5L. Fatima Vayani is supported by an EPSRC Grant (Doctoral Training Grant #EP/M506357/1).

Keywords

  • Degenerate strings
  • Elastic-degenerate strings
  • On-line algorithms
  • Pattern matching
  • String algorithms

Fingerprint

Dive into the research topics of 'On-line pattern matching on similar texts'. Together they form a unique fingerprint.

Cite this