Overview of the ClinIQLink 2025 Shared Task on Medical Question-Answering

June 18, 2025 · The Cartographer · 🏛 Proceedings of the 24th Workshop on Biomedical Language Processing

📚 The Cartographer
Survey/review paper: maps the landscape rather than implementing a method.

"No code URL or promise found in abstract"
"Title-pattern auto-detect: Overview of the ClinIQLink 2025 Shared Task on Medical Question-Answering"

Evidence collected by the PWNC Scanner

Authors: Brandon Colelough, Davis Bartels, Dina Demner-Fushman
arXiv ID: 2506.21597
Category: cs.CL (Computation & Language)
Cross-listed: cs.AI, cs.IR
Citations: 0
Venue: Proceedings of the 24th Workshop on Biomedical Language Processing
Last Checked: 5 days ago

Abstract
In this paper, we present an overview of ClinIQLink, a shared task, collocated with the 24th BioNLP workshop at ACL 2025, designed to stress-test large language models (LLMs) on medically-oriented question answering aimed at the level of a General Practitioner. The challenge supplies 4,978 expert-verified, medical source-grounded question-answer pairs that cover seven formats: true/false, multiple choice, unordered list, short answer, short-inverse, multi-hop, and multi-hop-inverse. Participating systems, bundled in Docker or Apptainer images, are executed on the CodaBench platform or the University of Maryland's Zaratan cluster. An automated harness (Task 1) scores closed-ended items by exact match and open-ended items with a three-tier embedding metric. A subsequent physician panel (Task 2) audits the top model responses.
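
The exact-match scoring that the abstract mentions for closed-ended items (for example true/false and multiple choice) can be illustrated with a minimal sketch. The normalization step (lowercasing and whitespace collapsing) and the function names `normalize`, `exact_match`, and `exact_match_accuracy` are assumptions made for illustration; the abstract does not say how the official harness normalizes answers, and the three-tier embedding metric for open-ended items is not reproduced here.

```python
def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization, not the official one)."""
    return " ".join(answer.strip().lower().split())


def exact_match(prediction: str, gold: str) -> int:
    """Return 1 if the normalized prediction equals the normalized gold answer, else 0."""
    return int(normalize(prediction) == normalize(gold))


def exact_match_accuracy(predictions: list[str], golds: list[str]) -> float:
    """Mean exact-match score over paired predictions and gold answers."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not golds:
        return 0.0
    return sum(exact_match(p, g) for p, g in zip(predictions, golds)) / len(golds)


if __name__ == "__main__":
    # Hypothetical closed-ended items: one true/false, one multiple choice.
    preds = [" True ", "B"]
    golds = ["true", "C"]
    print(exact_match_accuracy(preds, golds))
```

Under this sketch, a prediction counts as correct only when it matches the gold answer exactly after normalization, so partially correct answers score zero.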
