Learn from Real: Reality Defender's Submission to ASVspoof5 Challenge

October 09, 2024 Β· Declared Dead Β· πŸ› The Automatic Speaker Verification Spoofing Countermeasures Workshop (ASVspoof 2024)

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Yi Zhu, Chirag Goel, Surya Koppisetti, Trang Tran, Ankur Kumar, Gaurav Bharaj arXiv ID 2410.07379 Category eess.AS: Audio & Speech Cross-listed cs.AI, cs.CL Citations 3 Venue The Automatic Speaker Verification Spoofing Countermeasures Workshop (ASVspoof 2024) Last Checked 3 months ago
Abstract
Audio deepfake detection is crucial to combat the malicious use of AI-synthesized speech. Among many efforts undertaken by the community, the ASVspoof challenge has become one of the benchmarks to evaluate the generalizability and robustness of detection models. In this paper, we present Reality Defender's submission to the ASVspoof5 challenge, highlighting a novel pretraining strategy which significantly improves generalizability while maintaining low computational cost during training. Our system SLIM learns the style-linguistics dependency embeddings from various types of bonafide speech using self-supervised contrastive learning. The learned embeddings help to discriminate spoof from bonafide speech by focusing on the relationship between the style and linguistics aspects. We evaluated our system on ASVspoof5, ASV2019, and In-the-wild. Our submission achieved minDCF of 0.1499 and EER of 5.5% on ASVspoof5 Track 1, and EER of 7.4% and 10.8% on ASV2019 and In-the-wild respectively.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Audio & Speech

Died the same way β€” πŸ‘» Ghosted