PennyLang: Pioneering LLM-Based Quantum Code Generation with a Novel PennyLane-Centric Dataset
March 04, 2025 Β· Declared Dead Β· π arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Abdul Basit, Nouhaila Innan, Muhammad Haider Asif, Minghao Shao, Muhammad Kashif, Alberto Marchisio, Muhammad Shafique
arXiv ID
2503.02497
Category
cs.SE: Software Engineering
Cross-listed
cs.AI,
quant-ph
Citations
11
Venue
arXiv.org
Last Checked
4 months ago
Abstract
Large Language Models (LLMs) offer powerful capabilities in code generation, natural language understanding, and domain-specific reasoning. Their application to quantum software development remains limited, in part because of the lack of high-quality datasets both for LLM training and as dependable knowledge sources. To bridge this gap, we introduce PennyLang, an off-the-shelf, high-quality dataset of 3,347 PennyLane-specific quantum code samples with contextual descriptions, curated from textbooks, official documentation, and open-source repositories. Our contributions are threefold: (1) the creation and open-source release of PennyLang, a purpose-built dataset for quantum programming with PennyLane; (2) a framework for automated quantum code dataset construction that systematizes curation, annotation, and formatting to maximize downstream LLM usability; and (3) a baseline evaluation of the dataset across multiple open-source models, including ablation studies, all conducted within a retrieval-augmented generation (RAG) pipeline. Using PennyLang with RAG substantially improves performance: for example, Qwen 7B's success rate rises from 8.7% without retrieval to 41.7% with full-context augmentation, and LLaMa 4 improves from 78.8% to 84.8%, while also reducing hallucinations and enhancing quantum code correctness. Moving beyond Qiskit-focused studies, we bring LLM-based tools and reproducible methods to PennyLane for advancing AI-assisted quantum development.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Software Engineering
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Microservices: yesterday, today, and tomorrow
π
π
The Cartographer
A Survey of Machine Learning for Big Code and Naturalness
R.I.P.
π»
Ghosted
An Overview on Smart Contracts: Challenges, Advances and Platforms
R.I.P.
π»
Ghosted
Slither: A Static Analysis Framework For Smart Contracts
R.I.P.
π»
Ghosted
ContractFuzzer: Fuzzing Smart Contracts for Vulnerability Detection
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted