Learning neural trans-dimensional random field language models with noise-contrastive estimation
October 30, 2017 · Declared Dead · IEEE International Conference on Acoustics, Speech, and Signal Processing
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Bin Wang, Zhijian Ou
arXiv ID
1710.10739
Category
cs.CL: Computation & Language
Cross-listed
stat.ML
Citations
19
Venue
IEEE International Conference on Acoustics, Speech, and Signal Processing
Last Checked
4 months ago
Abstract
Trans-dimensional random field language models (TRF LMs), where sentences are modeled as a collection of random fields, have shown performance close to that of LSTM LMs in speech recognition and are computationally more efficient in inference. However, the training efficiency of neural TRF LMs is not satisfactory, which limits the scalability of TRF LMs to large training corpora. In this paper, several techniques in both model formulation and parameter estimation are proposed to improve the training efficiency and performance of neural TRF LMs. First, TRFs are reformulated in the form of exponential tilting of a reference distribution. Second, noise-contrastive estimation (NCE) is introduced to jointly estimate the model parameters and normalization constants. Third, we extend the neural TRF LMs by incorporating a deep convolutional neural network (CNN) and a bidirectional LSTM into the potential function to extract deep hierarchical features and bidirectional sequential features. Together, these techniques enable the successful and efficient training of neural TRF LMs on a 40x larger training set in only 1/3 of the training time, and further reduce the WER by a relative 4.7% on top of a strong LSTM LM baseline.
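To make the first two ideas in the abstract concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a five-symbol discrete domain instead of sentences, a uniform reference distribution q that also serves as the NCE noise distribution, and a table-valued potential phi in place of a neural network. The model has the exponential-tilting form p(x) = q(x) * exp(phi[x] - log_z), and log_z is learned as an ordinary parameter alongside phi rather than computed by summation. All names (fit_nce, demo, phi, log_z) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_nce(data, noise, num_symbols, steps=5000, lr=0.5):
    """Fit phi and log_z by gradient ascent on the NCE objective.

    Assumption: the noise distribution equals the reference distribution q,
    so q cancels in the log-ratio log p_model(x) - log(nu * q(x)).
    """
    nu = len(noise) / len(data)  # noise-to-data sample ratio
    # Per-symbol counts, both scaled by the number of data samples.
    d_freq = np.bincount(data, minlength=num_symbols) / len(data)
    n_freq = np.bincount(noise, minlength=num_symbols) / len(data)

    phi = np.zeros(num_symbols)  # potential values (a table in this toy)
    log_z = 0.0                  # log normalization constant, learned jointly

    for _ in range(steps):
        # Log-odds that a symbol came from the model rather than the noise.
        u = phi - log_z - np.log(nu)
        s = sigmoid(u)
        # Gradient of the NCE objective with respect to u, per symbol.
        grad_u = d_freq * (1.0 - s) - n_freq * s
        phi += lr * grad_u
        log_z -= lr * grad_u.sum()  # du/dlog_z = -1 for every symbol
    return phi, log_z


def demo(seed=0):
    rng = np.random.default_rng(seed)
    p_data = np.array([0.4, 0.3, 0.15, 0.1, 0.05])  # toy "true" distribution
    q = np.full(5, 0.2)                             # uniform reference/noise
    data = rng.choice(5, size=20000, p=p_data)
    noise = rng.choice(5, size=20000, p=q)
    phi, log_z = fit_nce(data, noise, num_symbols=5)
    model_probs = q * np.exp(phi - log_z)           # exponential tilting of q
    return p_data, model_probs


if __name__ == "__main__":
    p_data, model_probs = demo()
    print("target:", p_data)
    print("model: ", np.round(model_probs, 3))
    print("sum:   ", round(float(model_probs.sum()), 3))
```

In this toy, the fitted probabilities should land close to the target distribution and sum to roughly one even though no explicit normalization is ever performed, which is the property that lets NCE treat normalization constants as parameters. In the TRF setting the abstract describes, normalization constants and a neural potential function take the place of the single log_z and the phi table used here.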
Similar Papers
In the same crypt: Computation & Language
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
The Ethereal
Effective Approaches to Attention-based Neural Machine Translation
Old Age
A large annotated corpus for learning natural language inference
Old Age
HellaSwag: Can a Machine Really Finish Your Sentence?
Died the same way: Ghosted
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning