Exploring End-to-End Techniques for Low-Resource Speech Recognition

July 02, 2018 · Declared Dead · 🏛 International Conference on Speech and Computer

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Vladimir Bataev, Maxim Korenevsky, Ivan Medennikov, Alexander Zatvornitskiy
arXiv ID: 1807.00868
Category: cs.SD (Sound)
Cross-listed: cs.CL, eess.AS
Citations: 9
Venue: International Conference on Speech and Computer
Last Checked: 3 months ago
Abstract
In this work we present a simple grapheme-based system for low-resource speech recognition using Babel data for Turkish spontaneous speech (80 hours). We have investigated the performance of different neural network architectures, including fully-convolutional, recurrent and ResNet with GRU. Different features and normalization techniques are compared as well. We also propose a modification of the CTC loss that uses segmentation during training, which leads to an improvement when decoding with a small beam size. Our best model achieved a word error rate of 45.8%, which is, to our knowledge, the best reported result for end-to-end systems using in-domain data for this task.
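For readers unfamiliar with the metric quoted in the abstract, word error rate is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch under that standard definition, assuming plain whitespace tokenization. The function name `word_error_rate` and the sample strings are illustrative only and are not taken from the paper, whose scoring setup is not described on this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()  # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one substitution and one deletion
    # against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors while the denominator is the reference length, the value can exceed 1.0 when the hypothesis is much longer than the reference.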
Community shame:
Not yet rated
