Inspecting state of the art performance and NLP metrics in image-based medical report generation

November 18, 2020 · Declared Dead · 🏛 LatinX in AI at Neural Information Processing Systems Conference 2020

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Pablo Pino, Denis Parra, Pablo Messina, Cecilia Besa, Sergio Uribe
arXiv ID: 2011.09257
Category: cs.CL (Computation & Language)
Cross-listed: cs.AI, cs.CV, cs.LG
Citations: 10
Venue: LatinX in AI at Neural Information Processing Systems Conference 2020
Last Checked: 4 months ago

Abstract
Several deep learning architectures have been proposed over the last years to deal with the problem of generating a written report given an imaging exam as input. Most works evaluate the generated reports using standard Natural Language Processing (NLP) metrics (e.g. BLEU, ROUGE), reporting significant progress. In this article, we contrast this progress by comparing state of the art (SOTA) models against weak baselines. We show that simple and even naive approaches yield near SOTA performance on most traditional NLP metrics. We conclude that evaluation methods in this task should be further studied towards correctly measuring clinical accuracy, ideally involving physicians to contribute to this end.

📜 Similar Papers

In the same crypt: Computation & Language

🌅 Old Age

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, ... (+6 more)

cs.CL ๐Ÿ› NeurIPS ๐Ÿ“š 166.0K cites 9 years ago

Died the same way: 👻 Ghosted