emLam -- a Hungarian Language Modeling baseline
January 26, 2017 · Entered Twilight · arXiv.org
"Last commit was 5.0 years ago (≥5 year threshold)"
Evidence collected by the PWNC Scanner
Repo contents: .gitignore, LICENSE, MANIFEST.in, README.md, conf, emLam, requirements.txt, requirements_gpu.txt, scripts, setup.py
Authors
Dávid Márk Nemeskey
arXiv ID
1701.07880
Category
cs.CL: Computation & Language
Citations
3
Venue
arXiv.org
Repository
https://github.com/DavidNemeskey/emLam
Last Checked
4 months ago
Abstract
This paper aims to make up for the lack of documented baselines for Hungarian language modeling. Various approaches are evaluated on three publicly available Hungarian corpora. Perplexity values comparable to models of similar-sized English corpora are reported. A new, freely downloadable Hungarian benchmark corpus is introduced.
Similar Papers
In the same crypt: Computation & Language
Old Age
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
The Ethereal
Effective Approaches to Attention-based Neural Machine Translation
Old Age
A large annotated corpus for learning natural language inference
Old Age