Computing MEMs and Relatives on Repetitive Text Collections
October 18, 2022 Β· Declared Dead Β· π ACM Trans. Algorithms
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Gonzalo Navarro
arXiv ID
2210.09914
Category
cs.DS: Data Structures & Algorithms
Citations
7
Venue
ACM Trans. Algorithms
Last Checked
4 months ago
Abstract
We consider the problem of computing the Maximal Exact Matches (MEMs) of a given pattern $P[1 .. m]$ on a large repetitive text collection $T[1 .. n]$, which is represented as a (hopefully much smaller) run-length context-free grammar of size $g_{rl}$. We show that the problem can be solved in time $O(m^2 \log^Ξ΅n)$, for any constant $Ξ΅> 0$, on a data structure of size $O(g_{rl})$. Further, on a locally consistent grammar of size $O(Ξ΄\log\frac{n}Ξ΄)$, the time decreases to $O(m\log m(\log m + \log^Ξ΅n))$. The value $Ξ΄$ is a function of the substring complexity of $T$ and $Ξ©(Ξ΄\log\frac{n}Ξ΄)$ is a tight lower bound on the compressibility of repetitive texts $T$, so our structure has optimal size in terms of $n$ and $Ξ΄$. We extend our results to several related problems, such as finding $k$-MEMs, MUMs, rare MEMs, and applications.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
π
π
The Cartographer
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
π
π
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted