## Abstract

A weighted sequence is a sequence of probability mass functions over a finite alphabet. A weighted index is a data structure constructed for a weighted sequence and a threshold [Formula presented] that, given a string pattern, reports all positions where it occurs in the weighted sequence with probability at least [Formula presented]. We present an O(nz)-time construction of an O(nz)-sized weighted index for a weighted sequence of length n that answers queries in optimal time. The previous solution by Amir et al. (2008) required O(nz^{2}logz) time and space. Our main tools are a construction of a family of ⌊z⌋ strings that carries the information about all the strings that occur in a weighted sequence and a more straightforward solution to so-called property indexing. We present applications of our weighted index, in particular in approximate and general scenarios that were introduced by Biswas et al. (2016), and provide its implementation.

Original language | English |
---|---|

Article number | 104462 |

Journal | Information and Computation |

Volume | 270 |

DOIs | |

Publication status | Published - Feb 2020 |

Externally published | Yes |

### Funding

Supported by ISF grants No. 824/17 and 1278/16 and by an ERC grant MPM under the EU's Horizon 2020 Research and Innovation Programme (grant No. 683064).Supported by the “Algorithms for text processing with errors and uncertainties” project carried out within the HOMING program of the Foundation for Polish Science no. POIR.04.04.00-00-24BA/16, co-financed by the European Union under the European Regional Development Fund.

Funders | Funder number |
---|---|

EU's Horizon 2020 research and innovation programme | |

Horizon 2020 Framework Programme | 683064 |

European Commission | |

European Research Council | |

Fundacja na rzecz Nauki Polskiej | POIR.04.04.00-00-24BA/16 |

Israel Science Foundation | 824/17, 1278/16 |

European Regional Development Fund |

## Keywords

- Position weight matrix (PWM)
- Property indexing
- Suffix tree
- Text indexing
- Weighted sequence