Bidirectional Quaternion Long-Short Term Memory Recurrent Neural Networks for Speech Recognition

November 06, 2018 · Declared Dead · 🏛 IEEE International Conference on Acoustics, Speech, and Signal Processing

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Titouan Parcollet, Mohamed Morchid, Georges Linarès, Renato De Mori
arXiv ID: 1811.02566
Category: eess.AS: Audio & Speech
Cross-listed: cs.LG, cs.SD, eess.SP, stat.ML
Citations: 21
Venue: IEEE International Conference on Acoustics, Speech, and Signal Processing
Last Checked: 2 months ago

Abstract

Recurrent neural networks (RNNs) are at the core of modern automatic speech recognition (ASR) systems. In particular, long-short term memory (LSTM) recurrent neural networks have achieved state-of-the-art results in many speech recognition tasks, owing to their efficient representation of long- and short-term dependencies in sequences of inter-dependent features. Nonetheless, traditional real-valued representations only weakly capture the internal dependencies among the elements that compose multidimensional features. We propose a novel quaternion long-short term memory (QLSTM) recurrent neural network that uses the quaternion algebra to take into account both the external relations between the features composing a sequence and these internal latent structural dependencies. QLSTMs are compared to LSTMs on a memory copy-task and on a realistic speech recognition application using the Wall Street Journal (WSJ) dataset. QLSTM achieves better performance in both experiments with up to $2.8$ times fewer learnable parameters, leading to a more expressive representation of the information.
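As a rough illustration of the quaternion algebra the abstract refers to, the sketch below implements the Hamilton product, the standard multiplication rule for quaternions, and compares weight counts for a real-valued dense matrix and a quaternion-valued one of the same real dimensionality. This is not the authors' code (the listing above reports that none was found). The function names and the 1024-unit layer size are assumptions chosen for illustration, and the per-matrix factor of four shown here is a property of a single quaternion weight matrix, not the paper's reported overall figure of up to $2.8$.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def real_dense_params(n_in, n_out):
    """Weights in a real-valued dense matrix (bias ignored)."""
    return n_in * n_out


def quaternion_dense_params(n_in, n_out):
    """Weights in a quaternion dense matrix covering the same real dimensions.

    Hypothetical helper: n_in and n_out are counted in real units, so each
    quaternion groups four of them and each quaternion weight stores four reals.
    """
    if n_in % 4 or n_out % 4:
        raise ValueError("dimensions must be divisible by 4")
    return (n_in // 4) * (n_out // 4) * 4


if __name__ == "__main__":
    i = (0, 1, 0, 0)
    j = (0, 0, 1, 0)
    print(hamilton_product(i, j))  # i * j
    print(hamilton_product(j, i))  # j * i, showing non-commutativity
    print(real_dense_params(1024, 1024), quaternion_dense_params(1024, 1024))
```

Because every output component of the Hamilton product mixes all four input components, a single quaternion weight ties together the four values inside one multidimensional feature, which is one way to read the "internal dependencies" mentioned in the abstract.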