AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval
March 30, 2022 Β· Declared Dead Β· π Computer Vision and Pattern Recognition
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Riku Togashi, Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila, Tetsuya Sakai
arXiv ID
2203.16062
Category
cs.CV: Computer Vision
Cross-listed
cs.IR
Citations
1
Venue
Computer Vision and Pattern Recognition
Last Checked
4 months ago
Abstract
Evaluation measures have a crucial impact on the direction of research. Therefore, it is of utmost importance to develop appropriate and reliable evaluation measures for new applications where conventional measures are not well suited. Video Moment Retrieval (VMR) is one such application, and the current practice is to use R@$K,ΞΈ$ for evaluating VMR systems. However, this measure has two disadvantages. First, it is rank-insensitive: It ignores the rank positions of successfully localised moments in the top-$K$ ranked list by treating the list as a set. Second, it binarizes the Intersection over Union (IoU) of each retrieved video moment using the threshold $ΞΈ$ and thereby ignoring fine-grained localisation quality of ranked moments. We propose an alternative measure for evaluating VMR, called Average Max IoU (AxIoU), which is free from the above two problems. We show that AxIoU satisfies two important axioms for VMR evaluation, namely, \textbf{Invariance against Redundant Moments} and \textbf{Monotonicity with respect to the Best Moment}, and also that R@$K,ΞΈ$ satisfies the first axiom only. We also empirically examine how AxIoU agrees with R@$K,ΞΈ$, as well as its stability with respect to change in the test data and human-annotated temporal boundaries.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Computer Vision
π
π
Old Age
π
π
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
π
π
Old Age
SSD: Single Shot MultiBox Detector
π
π
Old Age
Squeeze-and-Excitation Networks
π
π
Old Age
Fast R-CNN
π
π
Old Age
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted