Quantitative Analysis of Audio-Visual Tasks: An Information-Theoretic Perspective

September 29, 2024 · Declared Dead · 🏛 International Symposium on Chinese Spoken Language Processing

👻 CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Chen Chen, Xiaolou Li, Zehua Liu, Lantian Li, Dong Wang
arXiv ID: 2409.19575
Category: cs.SD (Sound)
Cross-listed: cs.CL, cs.MM, eess.AS
Citations: 2
Venue: International Symposium on Chinese Spoken Language Processing
Last Checked: 3 months ago

Abstract
In the field of spoken language processing, audio-visual speech processing is receiving increasing research attention. Key components of this research include tasks such as lip reading, audio-visual speech recognition, and visual-to-speech synthesis. Although significant success has been achieved, theoretical analysis is still insufficient for audio-visual tasks. This paper presents a quantitative analysis based on information theory, focusing on information intersection between different modalities. Our results show that this analysis is valuable for understanding the difficulties of audio-visual processing tasks as well as the benefits that could be obtained by modality integration.
