Sec2Sec Co-attention for Video-Based Apparent Affective Prediction

August 27, 2024 · Entered Twilight

💤 TWILIGHT: Eternal Rest
Repo abandoned since publication

Repo contents: README.md, Sec2Sec_Co-attention_Transformer.pdf, dataloader.py, layer.py, train.py, utils.py

Authors: Mingwei Sun, Kunpeng Zhang
arXiv ID: 2408.15209
Category: cs.MM (Multimedia)
Citations: 0
Repository: https://github.com/nestor-sun/sec2sec (⭐ 8)
Last checked: 3 months ago
Abstract
Video-based apparent affect detection plays a crucial role in video understanding, as it encompasses various elements such as vision, audio, audio-visual interactions, and spatiotemporal information, which are essential for accurate video predictions. However, existing approaches often focus on extracting only a subset of these elements, resulting in the limited predictive capacity of their models. To address this limitation, we propose a novel LSTM-based network augmented with a Transformer co-attention mechanism for predicting apparent affect in videos. We demonstrate that our proposed Sec2Sec Co-attention Transformer surpasses multiple state-of-the-art methods in predicting apparent affect on two widely used datasets: LIRIS-ACCEDE and First Impressions. Notably, our model offers interpretability, allowing us to examine the contributions of different time points to the overall prediction. The implementation is available at: https://github.com/nestor-sun/sec2sec.
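The abstract names a co-attention mechanism between modalities and says the model exposes how much each time point contributes. The sketch below illustrates that general idea only. It is not the authors' implementation: the function names (`cross_attention`, `co_attention`), the toy feature vectors, and the absence of learned projections and LSTM layers are all assumptions made for illustration. It uses plain scaled dot-product attention in both directions (visual attending to audio, and audio attending to visual) and returns the attention weights so they can be inspected per time step.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    # Scaled dot-product attention: every query step attends over all key steps.
    # Returns the attended context vectors and the attention weights per query.
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        dim = len(values[0])
        out = [sum(w[t] * values[t][j] for t in range(len(values))) for j in range(dim)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights

def co_attention(visual, audio):
    # Hypothetical helper: attention applied in both directions between modalities.
    v_ctx, v2a = cross_attention(visual, audio, audio)    # visual queries, audio keys/values
    a_ctx, a2v = cross_attention(audio, visual, visual)   # audio queries, visual keys/values
    return v_ctx, a_ctx, v2a, a2v

# Toy inputs (made up): 3 visual time steps and 2 audio time steps, 2 features each.
visual = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
audio = [[0.5, 0.5], [1.0, 0.0]]

v_ctx, a_ctx, v2a, a2v = co_attention(visual, audio)
for step, row in enumerate(v2a):
    print("visual step", step, "attends to audio steps with weights", row)
```

Each row of `v2a` sums to 1 and indicates how strongly one visual time step weights each audio time step. A full model would typically pass the context vectors on to further layers (the abstract mentions an LSTM-based network), which this sketch omits.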
