Conditional Variational Autoencoder for Neural Machine Translation
December 11, 2018 ยท Declared Dead ยท ๐ arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Artidoro Pagnoni, Kevin Liu, Shangyan Li
arXiv ID
1812.04405
Category
cs.CL: Computation & Language
Citations
59
Venue
arXiv.org
Last Checked
4 months ago
Abstract
We explore the performance of latent variable models for conditional text generation in the context of neural machine translation (NMT). Similar to Zhang et al., we augment the encoder-decoder NMT paradigm by introducing a continuous latent variable to model features of the translation process. We extend this model with a co-attention mechanism motivated by Parikh et al. in the inference network. Compared to the vision domain, latent variable models for text face additional challenges due to the discrete nature of language, namely posterior collapse. We experiment with different approaches to mitigate this issue. We show that our conditional variational model improves upon both discriminative attention-based translation and the variational baseline presented in Zhang et al. Finally, we present some exploration of the learned latent space to illustrate what the latent variable is capable of capturing. This is the first reported conditional variational model for text that meaningfully utilizes the latent variable without weakening the translation model.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Computation & Language
๐
๐
Old Age
๐
๐
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
๐
๐
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
๐ฎ
๐ฎ
The Ethereal
Effective Approaches to Attention-based Neural Machine Translation
๐
๐
Old Age
A large annotated corpus for learning natural language inference
๐
๐
Old Age
HellaSwag: Can a Machine Really Finish Your Sentence?
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted