Weighted Shortest Common Supersequence Problem Revisited

Panagiotis Charalampopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski*, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, Wiktor Zuba

*Corresponding author for this work

Research output: Chapter in Book / Report / Conference proceedingConference contributionAcademicpeer-review

Abstract

A weighted string, also known as a position weight matrix, is a sequence of probability distributions over some alphabet. We revisit the Weighted Shortest Common Supersequence (WSCS) problem, introduced by Amir et al. [SPIRE 2011], that is, the SCS problem on weighted strings. In the WSCS problem, we are given two weighted strings (Formula presented) and (Formula presented) and a threshold (Formula presented) on probability, and we are asked to compute the shortest (standard) string S such that both (Formula presented) and (Formula presented) match subsequences of S (not necessarily the same) with probability at least (Formula presented). Amir et al. showed that this problem is NP-complete if the probabilities, including the threshold (Formula presented), are represented by their logarithms (encoded in binary). We present an algorithm that solves the WSCS problem for two weighted strings of length n over a constant-sized alphabet in (Formula presented) time. Notably, our upper bound matches known conditional lower bounds stating that the WSCS problem cannot be solved in (Formula presented) time or in (Formula presented) with time, where the (Formula presented) notation suppresses factors polynomial with respect to the instance size (with numeric values encoded in binary), unless there is a breakthrough improving upon long-standing upper bounds for fundamental NP-hard problems (CNF-SAT and Subset Sum, respectively). We also discover a fundamental difference between the WSCS problem and the Weighted Longest Common Subsequence (WLCS) problem, introduced by Amir et al. [JDA 2010]. We show that the WLCS problem cannot be solved in (Formula presented) time, for any function f(z), unless (Formula presented).

Original languageEnglish
Title of host publicationString Processing and Information Retrieval - 26th International Symposium, SPIRE 2019, Proceedings
EditorsNieves R. Brisaboa, Simon J. Puglisi
PublisherSpringer
Pages221-238
Number of pages18
ISBN (Print)9783030326852
DOIs
Publication statusPublished - 1 Jan 2019
Externally publishedYes
Event26th International Symposium on String Processing and Information Retrieval, SPIRE 2019 - Segovia, Spain
Duration: 7 Oct 20199 Oct 2019

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume11811 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference26th International Symposium on String Processing and Information Retrieval, SPIRE 2019
CountrySpain
CitySegovia
Period7/10/199/10/19

Fingerprint Dive into the research topics of 'Weighted Shortest Common Supersequence Problem Revisited'. Together they form a unique fingerprint.

Cite this