Non-native Speaker Verification for Spoken Language Assessment
September 30, 2019 · Declared Dead · arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Linlin Wang, Yu Wang, Mark J. F. Gales
arXiv ID
1909.13695
Category
eess.AS: Audio & Speech
Cross-listed
cs.CL, cs.SD
Citations
1
Venue
arXiv.org
Last Checked
3 months ago
Abstract
Automatic spoken language assessment systems are becoming more popular in order to handle increasing interest in second language learning. One challenge for these systems is to detect malpractice. Malpractice can take a range of forms; this paper focuses on detecting when a candidate attempts to impersonate another in a speaking test. This form of malpractice is closely related to speaker verification, but applied in the specific domain of spoken language assessment. Advanced speaker verification systems, which leverage deep-learning approaches to extract speaker representations, have been successfully applied to a range of native speaker verification tasks. These systems are explored for non-native spoken English data in this paper. The data used for speaker enrolment and verification is mainly taken from the BULATS test, which assesses English language skills for business. Performance of systems trained on relatively limited amounts of BULATS data, and standard large speaker verification corpora, is compared. Experimental results on large-scale test sets with millions of trials show that the best performance is achieved by adapting the imported model to non-native data. Breakdown of impostor trials across different first languages (L1s) and grades is analysed, which shows that inter-L1 impostors are more challenging for speaker verification systems.
Similar Papers
In the same crypt: Audio & Speech
R.I.P.
👻
Ghosted
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction
R.I.P.
👻
Ghosted
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
R.I.P.
👻
Ghosted
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech
R.I.P.
👻
Ghosted
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
R.I.P.
👻
Ghosted
Utterance-level Aggregation For Speaker Recognition In The Wild
Died the same way: 👻 Ghosted
R.I.P.
👻
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
👻
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
👻
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
👻
Ghosted