STRIDE: Simple Type Recognition In Decompiled Executables

July 03, 2024 ยท Entered Twilight ยท ๐Ÿ› arXiv.org

๐Ÿ’ค TWILIGHT: Eternal Rest
Repo abandoned since publication

Repo contents: LICENSE, README.md, assets, stride

Authors Harrison Green, Edward J. Schwartz, Claire Le Goues, Bogdan Vasilescu arXiv ID 2407.02733 Category cs.CR: Cryptography & Security Citations 5 Venue arXiv.org Repository https://github.com/hgarrereyn/STRIDE โญ 119 Last Checked 3 months ago
Abstract
Decompilers are widely used by security researchers and developers to reverse engineer executable code. While modern decompilers are adept at recovering instructions, control flow, and function boundaries, some useful information from the original source code, such as variable types and names, is lost during the compilation process. Our work aims to predict these variable types and names from the remaining information. We propose STRIDE, a lightweight technique that predicts variable names and types by matching sequences of decompiler tokens to those found in training data. We evaluate it on three benchmark datasets and find that STRIDE achieves comparable performance to state-of-the-art machine learning models for both variable retyping and renaming while being much simpler and faster. We perform a detailed comparison with two recent SOTA transformer-based models in order to understand the specific factors that make our technique effective. We implemented STRIDE in fewer than 1000 lines of Python and have open-sourced it under a permissive license at https://github.com/hgarrereyn/STRIDE.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Cryptography & Security