MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding
May 15, 2024 · Declared Dead · 🏛 Visual Communications and Image Processing
"Paper promises code 'coming soon'"
Evidence collected by the PWNC Scanner
Authors
Jiajie Teng, Huiyu Duan, Yucheng Zhu, Sijing Wu, Guangtao Zhai
arXiv ID
2405.09286
Category
cs.MM: Multimedia
Cross-listed
cs.CV
Citations
3
Venue
Visual Communications and Image Processing
Last Checked
1 month ago
Abstract
Recent years have witnessed the rapid development of short videos, which usually contain both visual and audio modalities. Background music is important to short videos and can significantly influence viewers' emotions. However, at present, the background music of short videos is generally chosen by the video producer, and there is a lack of automatic music recommendation methods for short videos. This paper introduces MVBind, an innovative Music-Video embedding space Binding model for cross-modal retrieval. MVBind operates as a self-supervised approach, acquiring inherent knowledge of intermodal relationships directly from data, without the need for manual annotations. Additionally, to compensate for the lack of a corresponding musical-visual pair dataset for short videos, we construct a dataset, SVM-10K (Short Video with Music-10K), which mainly consists of meticulously selected short videos. On this dataset, MVBind manifests significantly improved performance compared to other baseline methods. The constructed dataset and code will be released to facilitate future research.
📜 Similar Papers
In the same crypt — Multimedia
R.I.P. · 👻 Ghosted · 🌅 Old Age: Quality Assessment of In-the-Wild Videos
R.I.P. · 👻 Ghosted: Viewport-Adaptive Navigable 360-Degree Video Delivery
R.I.P. · 👻 Ghosted: A Comprehensive Survey on Cross-modal Retrieval
R.I.P. · 👻 Ghosted: An Overview of Cross-media Retrieval: Concepts, Methodologies, Benchmarks and Challenges
R.I.P. · 👻 Ghosted: A Convolutional Neural Network Approach for Post-Processing in HEVC Intra Coding
Died the same way — ⏳ Coming Soon™
R.I.P. · ⏳ Coming Soon™: Exploring Simple Siamese Representation Learning
R.I.P. · ⏳ Coming Soon™: An Analysis of Scale Invariance in Object Detection - SNIP
R.I.P. · ⏳ Coming Soon™: Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection
R.I.P. · ⏳ Coming Soon™