Sparse Regular Expression Matching
July 10, 2019 Β· Declared Dead Β· π ACM-SIAM Symposium on Discrete Algorithms
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Philip Bille, Inge Li GΓΈrtz
arXiv ID
1907.04752
Category
cs.DS: Data Structures & Algorithms
Citations
6
Venue
ACM-SIAM Symposium on Discrete Algorithms
Last Checked
4 months ago
Abstract
A regular expression specifies a set of strings formed by single characters combined with concatenation, union, and Kleene star operators. Given a regular expression $R$ and a string $Q$, the regular expression matching problem is to decide if $Q$ matches any of the strings specified by $R$. Regular expressions are a fundamental concept in formal languages and regular expression matching is a basic primitive for searching and processing data. A standard textbook solution [Thompson, CACM 1968] constructs and simulates a nondeterministic finite automaton, leading to an $O(nm)$ time algorithm, where $n$ is the length of $Q$ and $m$ is the length of $R$. Despite considerable research efforts only polylogarithmic improvements of this bound are known. Recently, conditional lower bounds provided evidence for this lack of progress when Backurs and Indyk [FOCS 2016] proved that, assuming the strong exponential time hypothesis (SETH), regular expression matching cannot be solved in $O((nm)^{1-Ξ΅})$, for any constant $Ξ΅> 0$. Hence, the complexity of regular expression matching is essentially settled in terms of $n$ and $m$. In this paper, we take a new approach and introduce a \emph{density} parameter, $Ξ$, that captures the amount of nondeterminism in the NFA simulation on $Q$. The density is at most $nm+1$ but can be significantly smaller. Our main result is a new algorithm that solves regular expression matching in $$O\left(Ξ\log \log \frac{nm}Ξ +n + m\right)$$ time. This essentially replaces $nm$ with $Ξ$ in the complexity of regular expression matching. We complement our upper bound by a matching conditional lower bound that proves that we cannot solve regular expression matching in time $O(Ξ^{1-Ξ΅})$ for any constant $Ξ΅> 0$ assuming SETH.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Data Structures & Algorithms
π
π
The Cartographer
R.I.P.
π»
Ghosted
Route Planning in Transportation Networks
R.I.P.
π»
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
π»
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
π»
Ghosted
Graph Isomorphism in Quasipolynomial Time
π
π
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted