Old Age
Quick Back-Translation for Unsupervised Machine Translation
December 01, 2023 · Entered Twilight · Conference on Empirical Methods in Natural Language Processing
Repo contents: .gitignore, CONTRIBUTING.md, LICENSE.TXT, MASS-unsupNMT, NOTICE.md, README.md, SECURITY.md, figs
Authors
Benjamin Brimacombe, Jiawei Zhou
arXiv ID
2312.00912
Category
cs.CL: Computation & Language
Cross-listed
cs.LG, cs.PL
Citations
2
Venue
Conference on Empirical Methods in Natural Language Processing
Repository
https://github.com/bbrimacombe/Quick-Back-Translation
⭐ 2
Last Checked
2 months ago
Abstract
The field of unsupervised machine translation has seen significant advancement from the marriage of the Transformer and the back-translation algorithm. The Transformer is a powerful generative model, and back-translation leverages the Transformer's high-quality translations for iterative self-improvement. However, the Transformer is encumbered by the run-time of autoregressive inference during back-translation, and back-translation is limited by a lack of synthetic data efficiency. We propose a two-for-one improvement to Transformer back-translation: Quick Back-Translation (QBT). QBT re-purposes the encoder as a generative model and uses encoder-generated sequences to train the decoder in conjunction with the original autoregressive back-translation step, improving data throughput and utilization. Experiments on various WMT benchmarks demonstrate that a relatively small number of QBT refining steps improves current unsupervised machine translation models, and that QBT dramatically outperforms the standard back-translation-only method in training efficiency at comparable translation quality.
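The abstract describes QBT only at a high level; the snippet below is a minimal, illustrative sketch of that idea in PyTorch, not the authors' implementation (see the linked repository for that). The model interface here is assumed: model.encode, model.encoder_output_proj, model.generate, and a model(src=..., tgt=...) call returning decoder logits are hypothetical names standing in for a generic encoder-decoder Transformer.

import torch
import torch.nn.functional as F

def qbt_refine_step(model, tgt_batch, optimizer, pad_id):
    """One hypothetical QBT refinement step for a single translation direction.

    tgt_batch: LongTensor [batch, time] of real target-language tokens.
    Synthetic "source" sentences are produced in two ways, and both are used
    to train the decoder to reconstruct tgt_batch:
      (a) a single non-autoregressive pass through the encoder ("quick"),
      (b) the usual autoregressive back-translation pass (slower, standard).
    """
    model.eval()
    with torch.no_grad():
        # (a) Encoder re-purposed as a generator: project encoder states to the
        #     vocabulary and take the argmax token at each position (assumed API).
        enc_states = model.encode(tgt_batch)                      # [B, T, H]
        quick_src = model.encoder_output_proj(enc_states).argmax(dim=-1)
        # (b) Conventional autoregressive back-translation (assumed API).
        ar_src = model.generate(tgt_batch, max_len=tgt_batch.size(1) + 10)

    model.train()
    total_loss = 0.0
    for synthetic_src in (quick_src, ar_src):
        # Standard teacher forcing: predict the real target from the synthetic source.
        logits = model(src=synthetic_src, tgt=tgt_batch[:, :-1])  # [B, T-1, V]
        total_loss = total_loss + F.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            tgt_batch[:, 1:].reshape(-1),
            ignore_index=pad_id,
        )

    optimizer.zero_grad()
    total_loss.backward()
    optimizer.step()
    return float(total_loss)

Because the encoder-generated sequences in (a) come from one parallel forward pass, they are far cheaper to produce than autoregressively decoded ones, which is where the training-efficiency gains described in the abstract come from.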
Similar Papers
In the same crypt · Computation & Language
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
R.I.P.
👻
Ghosted
Language Models are Few-Shot Learners
R.I.P.
👻
Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach
R.I.P.
👻
Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
R.I.P.
👻
Ghosted