deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets

June 23, 2015 · Declared Dead · 🏛 Annual Meeting of the Association for Computational Linguistics

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Michel Galley, Chris Brockett, Alessandro Sordoni, Yangfeng Ji, Michael Auli, Chris Quirk, Margaret Mitchell, Jianfeng Gao, Bill Dolan
arXiv ID: 1506.06863
Category: cs.CL (Computation & Language)
Citations: 159
Venue: Annual Meeting of the Association for Computational Linguistics
Last Checked: 2 months ago
Abstract
We introduce Discriminative BLEU (deltaBLEU), a novel metric for intrinsic evaluation of generated text in tasks that admit a diverse range of possible outputs. Reference strings are scored for quality by human raters on a scale of [-1, +1] to weight multi-reference BLEU. In tasks involving generation of conversational responses, deltaBLEU correlates reasonably with human judgments and outperforms sentence-level and IBM BLEU in terms of both Spearman's rho and Kendall's tau.
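The abstract does not spell out how the rater scores enter the computation, so the following is only a minimal sketch of one plausible reading: each reference carries a weight in [-1, +1], and a matched n-gram earns the best weighted, clipped credit among the references that contain it. The function names (`ngrams`, `weighted_precision`) and the choice of normalizer are assumptions made for illustration, not the authors' implementation, and the sketch omits the brevity penalty and the combination across n-gram orders that a full BLEU variant would need.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def weighted_precision(hypothesis, references, n):
    """Weighted n-gram precision for one hypothesis (illustrative only).

    references: list of (tokens, weight) pairs, weight in [-1, +1].
    Assumption: an n-gram's credit is the maximum of weight * clipped count
    over the references that contain it; the normalizer uses the largest
    reference weight.
    """
    hyp_counts = ngrams(hypothesis, n)
    if not hyp_counts:
        return 0.0
    ref_counts = [(ngrams(tokens, n), weight) for tokens, weight in references]
    max_weight = max(weight for _, weight in references)

    numerator = 0.0
    for gram, count in hyp_counts.items():
        credits = [w * min(count, rc[gram]) for rc, w in ref_counts if gram in rc]
        if credits:
            numerator += max(credits)

    denominator = max_weight * sum(hyp_counts.values())
    return numerator / denominator if denominator > 0 else 0.0


# Toy data: one well-rated and one poorly rated reference (hypothetical ratings).
refs = [("i like tea".split(), 1.0), ("i hate tea".split(), -1.0)]
good = weighted_precision("i like tea".split(), refs, 1)  # 1.0
bad = weighted_precision("i hate tea".split(), refs, 2)   # -1.0
```

Under these assumptions, a hypothesis that only matches a negatively rated reference is penalized rather than rewarded, which is the behavior plain multi-reference BLEU cannot express.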

📜 Similar Papers

In the same crypt: Computation & Language

🌅 Old Age

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, ... (+6 more)

cs.CL · 🏛 NeurIPS · 📚 166.0K cites · 8 years ago

Died the same way: 👻 Ghosted