๐
๐
Old Age
Follow the Wisdom of the Crowd: Effective Text Generation via Minimum Bayes Risk Decoding
November 14, 2022 ยท Entered Twilight ยท ๐ Annual Meeting of the Association for Computational Linguistics
Repo contents: .gitignore, LICENSE, README.md, data, fables, ground_truth, outputs, prompts, run_mbrd.py
Authors
Mirac Suzgun, Luke Melas-Kyriazi, Dan Jurafsky
arXiv ID
2211.07634
Category
cs.CL: Computation & Language
Cross-listed
cs.LG
Citations
52
Venue
Annual Meeting of the Association for Computational Linguistics
Repository
https://github.com/suzgunmirac/crowd-sampling
โญ 20
Last Checked
1 month ago
Abstract
In open-ended natural-language generation, existing text decoding methods typically struggle to produce text which is both diverse and high-quality. Greedy and beam search are known to suffer from text degeneration and linguistic diversity issues, while temperature, top-k, and nucleus sampling often yield diverse but low-quality outputs. In this work, we present crowd sampling, a family of decoding methods based on Bayesian risk minimization, to address this diversity-quality trade-off. Inspired by the principle of "the wisdom of the crowd," crowd sampling seeks to select a candidate from a pool of candidates that has the least expected risk (i.e., highest expected reward) under a generative model according to a given utility function. Crowd sampling can be seen as a generalization of numerous existing methods, including majority voting, and in practice, it can be used as a drop-in replacement for existing sampling methods. Extensive experiments show that crowd sampling delivers improvements of 3-7 ROUGE and BLEU points across a wide range of tasks, including summarization, data-to-text, translation, and textual style transfer, while achieving new state-of-the-art results on WebNLG and WMT'16.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Computation & Language
๐
๐
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
R.I.P.
๐ป
Ghosted
Language Models are Few-Shot Learners
R.I.P.
๐ป
Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach
R.I.P.
๐ป
Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
R.I.P.
๐ป
Ghosted