Ownership in low-level intermediate representation

August 07, 2024 · Declared Dead · 🏛 Formal Methods in Computer-Aided Design

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Siddharth Priya, Arie Gurfinkel arXiv ID 2408.04043 Category cs.PL: Programming Languages Cross-listed cs.SE Citations 1 Venue Formal Methods in Computer-Aided Design Last Checked 4 months ago

Abstract

The concept of ownership in high level languages can aid both the programmer and the compiler to reason about the validity of memory operations. Previously, ownership semantics has been used successfully in high level automatic program verification to model a reference to data by a first order logic (FOL) representation of data instead of maintaining an address map. However, ownership semantics is not used in low level program verification. We have identified two challenges. First, ownership information is lost when a program is compiled to a low level intermediate representation (e.g., in LLVM IR). Second, pointers in low level programs point to bytes using an address map (e.g., in unsafe Rust) and thus the verification condition (VC) cannot always replace a pointer by its FOL abstraction. To remedy the situation, we develop ownership semantics for an LLVM like low level intermediate representation. Using these semantics, the VC can opportunistically model some memory accesses by a direct access of a pointer cache that stores byte representation of data. This scheme reduces instances where an address map must be maintained, especially for mostly safe programs that follow ownership semantics. For unsafe functionality, memory accesses are modelled by operations on an address map and we provide mechanisms to keep the address map and pointer cache in sync. We implement these semantics in SEABMC, a bit precise bounded model checker for LLVM. For evaluation, the source programs are assumed to be written in C. Since C does not have ownership built in, suitable macros are added that introduce and preserve ownership during translation to LLVM like IR for verification. This approach is evaluated on mature open source C code. For both handcrafted benchmarks and practical programs, we observe a speedup of $1.3x-5x$ during SMT solving.