All-for-One and One-For-All: Deep learning-based feature fusion for Synthetic Speech Detection

July 28, 2023 · Declared Dead · 🏛 PKDD/ECML Workshops

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Daniele Mari, Davide Salvi, Paolo Bestagini, Simone Milani
arXiv ID: 2307.15555
Category: cs.SD (Sound)
Cross-listed: cs.CL, cs.CR, eess.AS
Citations: 5
Venue: PKDD/ECML Workshops
Last Checked: 3 months ago
Abstract
Recent advances in deep learning and computer vision have made the synthesis and counterfeiting of multimedia content more accessible than ever, leading to possible threats and dangers from malicious users. In the audio field, we are witnessing the growth of speech deepfake generation techniques, which solicit the development of synthetic speech detection algorithms to counter possible mischievous uses such as frauds or identity thefts. In this paper, we consider three different feature sets proposed in the literature for the synthetic speech detection task and present a model that fuses them, achieving overall better performances with respect to the state-of-the-art solutions. The system was tested on different scenarios and datasets to prove its robustness to anti-forensic attacks and its generalization capabilities.
