Phonotactic Complexity and its Trade-offs

May 07, 2020 ยท Declared Dead ยท ๐Ÿ› Transactions of the Association for Computational Linguistics

๐Ÿ‘ป CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Tiago Pimentel, Brian Roark, Ryan Cotterell arXiv ID 2005.03774 Category cs.CL: Computation & Language Citations 41 Venue Transactions of the Association for Computational Linguistics Last Checked 2 months ago
Abstract
We present methods for calculating a measure of phonotactic complexity---bits per phoneme---that permits a straightforward cross-linguistic comparison. When given a word, represented as a sequence of phonemic segments such as symbols in the international phonetic alphabet, and a statistical model trained on a sample of word types from the language, we can approximately measure bits per phoneme using the negative log-probability of that word under the model. This simple measure allows us to compare the entropy across languages, giving insight into how complex a language's phonotactics are. Using a collection of 1016 basic concept words across 106 languages, we demonstrate a very strong negative correlation of -0.74 between bits per phoneme and the average length of words.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

๐Ÿ“œ Similar Papers

In the same crypt โ€” Computation & Language

๐ŸŒ… ๐ŸŒ… Old Age

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, ... (+6 more)

cs.CL ๐Ÿ› NeurIPS ๐Ÿ“š 166.0K cites 8 years ago

Died the same way โ€” ๐Ÿ‘ป Ghosted