Do Large Language Models Pay Similar Attention Like Human Programmers When Generating Code?

June 02, 2023 · Declared Dead · 🏛 Proc. ACM Softw. Eng.

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Bonan Kou, Shengmai Chen, Zhijie Wang, Lei Ma, Tianyi Zhang
arXiv ID: 2306.01220
Category: cs.SE (Software Engineering)
Cross-listed: cs.HC, cs.LG
Citations: 21
Venue: Proc. ACM Softw. Eng.
Last Checked: 4 months ago
Abstract
Large Language Models (LLMs) have recently been widely used for code generation. Due to the complexity and opacity of LLMs, little is known about how these models generate code. We made the first attempt to bridge this knowledge gap by investigating whether LLMs attend to the same parts of a task description as human programmers during code generation. An analysis of six LLMs, including GPT-4, on two popular code generation benchmarks revealed a consistent misalignment between LLMs' and programmers' attention. We manually analyzed 211 incorrect code snippets and found five attention patterns that can be used to explain many code generation errors. Finally, a user study showed that model attention computed by a perturbation-based method is often favored by human programmers. Our findings highlight the need for human-aligned LLMs for better interpretability and programmer trust.
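The abstract refers to model attention computed by a perturbation-based method but does not describe the procedure. As a rough illustration of the general idea only, the sketch below uses a generic leave-one-out scheme: remove one token of the task description at a time and record how much a score drops. It then measures how many of the top-ranked tokens a human also marked as important. Everything here is an assumption for illustration: `perturbation_importance`, `top_k_overlap`, `toy_score`, and the human-marked indices are hypothetical, and `toy_score` is a stand-in for a real model's output likelihood, not the paper's setup.

```python
from typing import Callable, List, Set


def perturbation_importance(
    tokens: List[str],
    score: Callable[[List[str]], float],
) -> List[float]:
    """Leave-one-out importance: the score drop when each token is removed."""
    baseline = score(tokens)
    importances = []
    for i in range(len(tokens)):
        perturbed = tokens[:i] + tokens[i + 1:]  # drop the i-th token
        importances.append(baseline - score(perturbed))
    return importances


def top_k_overlap(importances: List[float], human_indices: Set[int], k: int) -> float:
    """Fraction of the model's top-k tokens that a human also marked."""
    ranked = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
    return len(set(ranked[:k]) & human_indices) / k


def toy_score(tokens: List[str]) -> float:
    """Hypothetical stand-in for a model score (assumption, not a real LLM)."""
    weights = {"sort": 2.0, "descending": 1.0}
    return sum(weights.get(t, 0.0) for t in tokens)


tokens = ["sort", "the", "list", "in", "descending", "order"]
importances = perturbation_importance(tokens, toy_score)

# Hypothetical human annotation: "sort" and "list" were marked as important.
overlap = top_k_overlap(importances, {0, 2}, k=2)
print(importances, overlap)
```

With a real model, `score` would typically be something like the log-likelihood of the generated code given the perturbed description, and the overlap metric could be swapped for whichever alignment measure is being studied.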
