Large-scale Speaker Retrieval on Random Speaker Variability Subspace

November 27, 2018 · Declared Dead · 🏛 Interspeech

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Suwon Shon, Younggun Lee, Taesu Kim
arXiv ID: 1811.10812
Category: eess.AS: Audio & Speech
Cross-listed: cs.IR
Citations: 6
Venue: Interspeech
Last Checked: 3 months ago
Abstract
This paper describes a fast speaker search system that retrieves segments of the same voice identity from large-scale data. A recent study shows that Locality Sensitive Hashing (LSH), used in conjunction with i-vectors, enables quick retrieval of a relevant voice from large-scale data while maintaining accuracy. In this paper, we propose Random Speaker-variability Subspace (RSS) projection to map data into LSH-based hash tables. We hypothesize that, rather than projecting onto a completely random subspace without considering the data, projecting onto a randomly generated speaker-variability subspace gives a better chance of putting representations of the same speaker into the same hash bins, so fewer hash tables are needed. Multiple RSSs can be generated by randomly selecting subsets of speakers from a large speaker cohort. In the experiments, the proposed approach is 100 times faster than linear search and 7 times faster than LSH.
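
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how each speaker-variability subspace is estimated, so the sketch assumes the top principal directions of the mean vectors of a random speaker subset, followed by sign-based hashing. All function names, the synthetic stand-in for i-vectors, and the parameter values are hypothetical, and the cohort doubles as the indexed database only to keep the example short.

```python
import numpy as np
from collections import defaultdict


def make_rss_projection(cohort, cohort_ids, n_subset, n_bits, rng):
    """Build one projection from a random subset of cohort speakers (assumed estimator)."""
    chosen = rng.choice(np.unique(cohort_ids), size=n_subset, replace=False)
    means = np.stack([cohort[cohort_ids == s].mean(axis=0) for s in chosen])
    center = means.mean(axis=0)
    # Right singular vectors of the centered speaker means span the
    # between-speaker variability of this particular subset.
    _, _, vt = np.linalg.svd(means - center, full_matrices=False)
    return center, vt[:n_bits].T  # projection has shape (dim, n_bits)


def hash_keys(vectors, center, proj):
    """Sign of each projected coordinate gives one bit; pack the bits into an integer."""
    bits = ((vectors - center) @ proj) > 0
    weights = 1 << np.arange(proj.shape[1])
    return bits.astype(np.int64) @ weights


def build_tables(database, cohort, cohort_ids, n_tables, n_subset, n_bits, rng):
    """Create one hash table per random speaker subset and index the database in each."""
    tables = []
    for _ in range(n_tables):
        center, proj = make_rss_projection(cohort, cohort_ids, n_subset, n_bits, rng)
        buckets = defaultdict(list)
        for idx, key in enumerate(hash_keys(database, center, proj)):
            buckets[int(key)].append(idx)
        tables.append((center, proj, buckets))
    return tables


def query(tables, q, database, top_k=5):
    """Collect candidates from the matching bin of every table, then rank by cosine similarity."""
    candidates = set()
    for center, proj, buckets in tables:
        key = int(hash_keys(q[None, :], center, proj)[0])
        candidates.update(buckets.get(key, []))
    if not candidates:
        return []
    cand = np.array(sorted(candidates))
    vecs = database[cand]
    sims = (vecs @ q) / (np.linalg.norm(vecs, axis=1) * np.linalg.norm(q))
    return cand[np.argsort(-sims)[:top_k]].tolist()


# Synthetic stand-in for i-vectors: one Gaussian cluster per speaker.
rng = np.random.default_rng(0)
dim, n_speakers, per_speaker = 16, 50, 10
speaker_means = rng.normal(size=(n_speakers, dim))
ids = np.repeat(np.arange(n_speakers), per_speaker)
vectors = speaker_means[ids] + 0.1 * rng.normal(size=(len(ids), dim))

tables = build_tables(vectors, vectors, ids, n_tables=4, n_subset=20, n_bits=8, rng=rng)
hits = query(tables, vectors[7], vectors)
print("speakers of retrieved segments:", ids[hits].tolist())
```

Only vectors that share a bin with the query in at least one table are scored, which is where the speed-up over linear search comes from. The number of tables and the number of bits per key trade recall against the size of the candidate set.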