HELO-APR: Enhancing Low-Resource Program Repair through Cross-Lingual Knowledge Transfer

April 18, 2026 · Grace Period · + Add venue

Authors Zhipeng Wang, Boyang Yang, Yidong Wan, Liuye Guo, You Lv, Tao Zheng, Zhuowei Wang, Tieke He arXiv ID 2604.17016 Category cs.SE: Software Engineering Citations 0

Abstract

Large Language Models (LLMs) perform well on automatic program repair (APR) for high-resource programming languages (HRPLs), but their effectiveness drops sharply in low-resource programming languages (LRPLs), due to a lack of sufficient verified buggy-fixed pairs for APR training. To address this challenge, we propose HELO-APR (High-resource Enabled LOw-resource APR), a two-stage APR framework that enables cross-lingual transfer of repair knowledge from HRPLs to LRPLs. HELO-APR (1) constructs high-quality LRPL training data by synthesizing LRPL buggy-fixed pairs from HRPL counterparts, preserving defect type consistency while ensuring the synthesized code is idiomatic, and then (2) adopts a curriculum learning strategy that progressively performs HRPL repair learning, cross-lingual repair alignment, and LRPL repair adaptation, improving repair effectiveness in LRPLs. Using C++ as the source HRPL and Ruby and Rust as the target LRPLs, experiments on xCodeEval show that HELO-APR consistently outperforms strong baselines, increasing Pass@1 from 31.32% to 48.65% on DeepSeek-Coder-6.7B and from 1.67% to 11.97% on CodeLlama-7B, while improving syntactic validity by raising the average target compilation rate on CodeLlama from 49.77% to 91.98%. On Defects4Ruby, HELO-APR increases BLEU-4 from 61.20 to 66.79 and ROUGE-1 from 76.76 to 83.59 on CodeLlama-7B, indicating higher similarity to developer patches in real-world settings. Finally, we conduct ablation studies to assess the necessity of each core component. These results suggest that verified cross-lingual supervision provides a reusable approach for improving LLM-based repair in low-resource languages.