Can Developers Prompt? A Controlled Experiment for Code Documentation Generation

August 01, 2024 · IEEE International Conference on Software Maintenance and Evolution


Authors: Hans-Alexander Kruse, Tim Puhlfürß, Walid Maalej
arXiv ID: 2408.00686
Category: cs.AI (Artificial Intelligence)
Cross-listed: cs.HC, cs.SE
Citations: 13
Venue: IEEE International Conference on Software Maintenance and Evolution
Last checked: 4 months ago

Abstract

Large language models (LLMs) hold great potential for automating tedious development tasks such as creating and maintaining code documentation. However, it is unclear to what extent developers can effectively prompt LLMs to create concise and useful documentation. We report on a controlled experiment with 20 professionals and 30 computer science students tasked with code documentation generation for two Python functions. The experimental group freely entered ad-hoc prompts in a ChatGPT-like extension of Visual Studio Code, while the control group executed a predefined few-shot prompt. Our results reveal that professionals and students were unaware of or unable to apply prompt engineering techniques. Students in particular perceived the documentation produced from ad-hoc prompts as significantly less readable, less concise, and less helpful than documentation from prepared prompts. Some professionals produced higher-quality documentation simply by including the keyword Docstring in their ad-hoc prompts. While students desired more support in formulating prompts, professionals appreciated the flexibility of ad-hoc prompting. Participants in both groups rarely assessed the output as perfect. Instead, they understood the tools as support for iteratively refining the documentation. Further research is needed to understand which prompting skills and preferences developers have and which support they need for certain tasks.
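To make the contrast between the two conditions concrete, the following is a minimal sketch of how a predefined few-shot prompt for docstring generation might be assembled, next to a typical ad-hoc prompt. The abstract does not reproduce the study's actual prompt, so the example pairs, the instruction wording, and the function names `build_few_shot_prompt` and `build_ad_hoc_prompt` are all hypothetical.

```python
from typing import Sequence, Tuple

# Hypothetical example pairs (function source, desired docstring).
# These are illustrative only and are not taken from the study.
EXAMPLES: Tuple[Tuple[str, str], ...] = (
    (
        "def add(a, b):\n    return a + b",
        '"""Return the sum of a and b."""',
    ),
    (
        "def is_even(n):\n    return n % 2 == 0",
        '"""Return True if n is even, otherwise False."""',
    ),
)


def build_few_shot_prompt(
    code: str, examples: Sequence[Tuple[str, str]] = EXAMPLES
) -> str:
    """Assemble a few-shot prompt asking for a concise Python docstring."""
    parts = ["Write a concise Python Docstring for the final function."]
    for source, docstring in examples:
        parts.append(f"Function:\n{source}\nDocstring:\n{docstring}")
    # The target function comes last, with the Docstring slot left open
    # so the model completes it in the style of the examples.
    parts.append(f"Function:\n{code}\nDocstring:")
    return "\n\n".join(parts)


def build_ad_hoc_prompt(code: str) -> str:
    """An unstructured prompt of the kind a participant might type (assumed)."""
    return f"Explain this code:\n{code}"


if __name__ == "__main__":
    target = "def double(x):\n    return x * 2"
    print(build_few_shot_prompt(target))
```

The few-shot variant fixes the output format through worked examples, whereas the ad-hoc variant leaves format and length to the model. The finding about the keyword Docstring suggests that even a single format cue in an ad-hoc prompt can narrow that gap.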