DALEQ -- Explainable Equivalence for Java Bytecode

August 03, 2025 · Declared Dead · 🏛 International Conference on Automated Software Engineering

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Jens Dietrich, Behnaz Hassanshahi arXiv ID 2508.01530 Category cs.CR: Cryptography & Security Citations 1 Venue International Conference on Automated Software Engineering Last Checked 4 months ago

Abstract

The security of software builds has attracted increased attention in recent years in response to incidents like solarwinds and xz. Now, several companies including Oracle and Google rebuild open source projects in a secure environment and publish the resulting binaries through dedicated repositories. This practice enables direct comparison between these rebuilt binaries and the original ones produced by developers and published in repositories such as Maven Central. These binaries are often not bitwise identical; however, in most cases, the differences can be attributed to variations in the build environment, and the binaries can still be considered equivalent. Establishing such equivalence, however, is a labor-intensive and error-prone process. While there are some tools that can be used for this purpose, they all fall short of providing provenance, i.e. readable explanation of why two binaries are equivalent, or not. To address this issue, we present daleq, a tool that disassembles Java byte code into a relational database, and can normalise this database by applying datalog rules. Those databases can then be used to infer equivalence between two classes. Notably, equivalence statements are accompanied with datalog proofs recording the normalisation process. We demonstrate the impact of daleq in an industrial context through a large-scale evaluation involving 2,714 pairs of jars, comprising 265,690 class pairs. In this evaluation, daleq is compared to two existing bytecode transformation tools. Our findings reveal a significant reduction in the manual effort required to assess non-bitwise equivalent artifacts, which would otherwise demand intensive human inspection. Furthermore, the results show that daleq outperforms existing tools by identifying more artifacts rebuilt from the same code as equivalent, even when no behavioral differences are present.