Audios Don't Lie: Multi-Frequency Channel Attention Mechanism for Audio Deepfake Detection

December 12, 2024 · Declared Dead · 🏛 arXiv.org

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Yangguang Feng arXiv ID 2412.09467 Category cs.SD: Sound Cross-listed cs.CL, eess.AS Citations 1 Venue arXiv.org Last Checked 4 months ago

Abstract

With the rapid development of artificial intelligence technology, the application of deepfake technology in the audio field has gradually increased, resulting in a wide range of security risks. Especially in the financial and social security fields, the misuse of deepfake audios has raised serious concerns. To address this challenge, this study proposes an audio deepfake detection method based on multi-frequency channel attention mechanism (MFCA) and 2D discrete cosine transform (DCT). By processing the audio signal into a melspectrogram, using MobileNet V2 to extract deep features, and combining it with the MFCA module to weight different frequency channels in the audio signal, this method can effectively capture the fine-grained frequency domain features in the audio signal and enhance the Classification capability of fake audios. Experimental results show that compared with traditional methods, the model proposed in this study shows significant advantages in accuracy, precision,recall, F1 score and other indicators. Especially in complex audio scenarios, this method shows stronger robustness and generalization capabilities and provides a new idea for audio deepfake detection and has important practical application value. In the future, more advanced audio detection technologies and optimization strategies will be explored to further improve the accuracy and generalization capabilities of audio deepfake detection.