A Myhill-Nerode Theorem for Generalized Automata, with Applications to Pattern Matching and Compression

February 13, 2023 ยท The Ethereal ยท ๐Ÿ› Symposium on Theoretical Aspects of Computer Science

๐Ÿ”ฎ THE ETHEREAL: The Ethereal
Pure theory โ€” exists on a plane beyond code

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Nicola Cotumaccio arXiv ID 2302.06506 Category cs.FL: Formal Languages Cross-listed cs.DS, cs.LO Citations 6 Venue Symposium on Theoretical Aspects of Computer Science Last Checked 2 months ago
Abstract
The model of generalized automata, introduced by Eilenberg in 1974, allows representing a regular language more concisely than conventional automata by allowing edges to be labeled not only with characters, but also strings. Giammaresi and Montalbano introduced a notion of determinism for generalized automata [STACS 1995]. While generalized deterministic automata retain many properties of conventional deterministic automata, the uniqueness of a minimal generalized deterministic automaton is lost. In the first part of the paper, we show that the lack of uniqueness can be explained by introducing a set $ \mathcal{W(A)} $ associated with a generalized automaton $ \mathcal{A} $. In this way, we derive for the first time a full Myhill-Nerode theorem for generalized automata, which contains the textbook Myhill-Nerode theorem for conventional automata as a degenerate case. In the second part of the paper, we show that the set $ \mathcal{W(A)} $ leads to applications for pattern matching and data compression. We show that a Wheeler generalized automata can be stored using $ \mathfrak{e} \log ฯƒ(1 + o(1)) + O(e) $ bits so that pattern matching queries can be solved in $ O(m \log \log ฯƒ) $ time, where $ \mathfrak{e} $ is the total length of all edge labels, $ e $ is the number of edges, $ ฯƒ$ is the size of the alphabet and $ m $ is the length of the pattern.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Formal Languages