Verified invertible lexer using regular expressions and DFAs
December 18, 2024 Β· Declared Dead Β· π arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Samuel Chassot, Viktor KunΔak
arXiv ID
2412.13581
Category
cs.PL: Programming Languages
Cross-listed
cs.SE
Citations
2
Venue
arXiv.org
Last Checked
4 months ago
Abstract
In this project, we explore the concept of invertibility applied to serialisation and lexing frameworks. Recall that, on one hand, serialisation is the process of taking a data structure and writing it to a bit array while parsing is the reverse operation, i.e., reading the bit array and constructing the data structure back. While lexing, on the other hand, is the process of reading a stream of characters and splitting them into tokens, by following a list of given rules. While used in different applications, both are similar in their abstract operation: they both take a list of simple characters and extract a more complex structure. Applications in which these two operations are used are different but they share a need for the invertibility of the process. For example, when tokenising a code file that was prettyprinted by a compiler, one would expect to get the same sequence of tokens. Similarly, when a spacecraft sends scientific data to the ground, one would expect the parsed data to be the same as the one serialised by the spacecraft. The idea of this project is to explore the idea of having a framework capable of generating parser/serialiser or lexer/prettyprinter pairs with a formally verified notion of invertibility. We first explore related works and frameworks. After that, we present our verified lexer framework developed in Scala and verified using the Stainless framework1. We explain the implementation choices we make and present the specifications and their proofs. The code of the lexer with the proofs is available on Github2. The main branch contains the regular expression (called regex from now on) matcher version and the verified Computable Languages while the dfa match branch contains the version using the DFA matcher.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Programming Languages
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions
R.I.P.
π»
Ghosted
Glow: Graph Lowering Compiler Techniques for Neural Networks
R.I.P.
π»
Ghosted
Learnable Programming: Blocks and Beyond
R.I.P.
π»
Ghosted
Scenic: A Language for Scenario Specification and Scene Generation
R.I.P.
π»
Ghosted
Vandal: A Scalable Security Analysis Framework for Smart Contracts
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted