A Survey of Recent DNN Architectures on the TIMIT Phone Recognition Task

June 19, 2018 · The Cartographer · 🏛 International Conference on Text, Speech and Dialogue

Authors: Josef Michalek, Jan Vanek
arXiv ID: 1806.07974
Category: cs.CL: Computation & Language
Cross-listed: cs.HC
Citations: 16
Venue: International Conference on Text, Speech and Dialogue
Last Checked: 2 days ago

Abstract

In this survey paper, we evaluate several recent deep neural network (DNN) architectures on the TIMIT phone recognition task. We chose the TIMIT corpus for its popularity and broad availability in the community. It also simulates a low-resource scenario, which is helpful for minor languages. We prefer the phone recognition task because it is much more sensitive to acoustic model quality than a large vocabulary continuous speech recognition (LVCSR) task. In recent years, many published DNN papers have reported results on TIMIT. However, the reported phone error rates (PERs) were often much higher than the PER of a simple feed-forward (FF) DNN. That was the main motivation for this paper: to provide baseline DNNs with the lowest possible PERs, together with open-source scripts, so that future papers can easily replicate the baseline results. To our knowledge, the best PER achieved in this survey is better than the best PER published to date.
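
The abstract compares systems by phone error rate. As a rough illustration of how that metric is commonly computed, the sketch below takes the Levenshtein distance between reference and hypothesis phone sequences (substitutions, deletions, and insertions) and divides it by the total number of reference phones. This is not the authors' scoring script: the function names `edit_distance` and `phone_error_rate` and the toy phone sequences are assumptions made for illustration, and any phone-set mapping applied before scoring is left out.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between a reference and a hypothesis phone sequence."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis phones.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference phone missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra phone in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phone_error_rate(refs: Sequence[Sequence[str]],
                     hyps: Sequence[Sequence[str]]) -> float:
    """PER in percent: total edit errors divided by total reference phones."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must contain the same number of utterances")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference contains no phones")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return 100.0 * total_errors / total_ref


if __name__ == "__main__":
    # Hypothetical utterance: one substitution (ae -> eh) and one deletion (final sil).
    reference = ["sil", "dh", "ax", "k", "ae", "t", "sil"]
    hypothesis = ["sil", "dh", "ax", "k", "eh", "t"]
    print(f"PER = {phone_error_rate([reference], [hypothesis]):.2f}%")
```

Errors are pooled over all utterances before dividing, so longer utterances weigh more than they would if per-utterance rates were averaged.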