5th Place Solution to Kaggle Google Universal Image Embedding Competition
October 18, 2022 · Entered Twilight · arXiv.org
Repo contents: .gitignore, LICENSE, README.md, directory_structure.txt, make_torchscript_wcls.py, models, preprocess_GPR1200.py, preprocess_food101.py, preprocess_glr2021_products10k.py, requirements.txt, src, tfrec2png.py, train_arcface.py, train_classifier.py
Authors
Noriaki Ota, Shingo Yokoi, Shinsuke Yamaoka
arXiv ID
2210.09495
Category
cs.CV: Computer Vision
Citations
2
Venue
arXiv.org
Repository
https://github.com/riron1206/kaggle-Google-Universal-Image-Embedding-Competition-5th-Place-Solution
⭐ 7
Last Checked
3 months ago
Abstract
In this paper, we present our solution, which placed 5th in the Kaggle Google Universal Image Embedding Competition in 2022. We use the ViT-H visual encoder of CLIP from the openclip repository as a backbone and train a head model composed of BatchNormalization and Linear layers using ArcFace. The training data is a subset of Products10K, GLDv2, GPR1200, and Food101. Applying test-time augmentation (TTA) to part of the images further improves the score. With this method, we achieve a score of 0.684 on the public leaderboard and 0.688 on the private leaderboard. Our code is available at https://github.com/riron1206/kaggle-Google-Universal-Image-Embedding-Competition-5th-Place-Solution
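To make the ArcFace step mentioned in the abstract concrete, the following is a minimal pure-Python sketch of how ArcFace logits are typically computed for a single embedding. It is not the authors' training code (the repository's train_arcface.py is the reference). The function names, the toy vectors, and the `scale` and `margin` values are assumptions chosen for illustration only.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def arcface_logits(embedding, class_weights, target, scale=30.0, margin=0.5):
    """Return ArcFace logits for one sample.

    scale and margin are illustrative defaults, not values from the paper.
    """
    e = l2_normalize(embedding)
    logits = []
    for j, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos = max(-1.0, min(1.0, cos))  # guard acos against rounding error
        if j == target:
            # Add the angular margin to the target class only.
            # Simplification: no special handling when theta + margin > pi.
            cos = math.cos(math.acos(cos) + margin)
        logits.append(scale * cos)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


# Toy example: a 3-dimensional embedding and three class centres (hypothetical).
emb = [0.6, 0.8, 0.0]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
plain = arcface_logits(emb, W, target=1, margin=0.0)
with_margin = arcface_logits(emb, W, target=1, margin=0.5)
```

The margin lowers only the target-class logit, so the loss stays higher until the embedding moves closer in angle to its class centre. That pressure toward tight, well-separated clusters is why ArcFace is a common choice for training retrieval embeddings.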
Similar Papers
In the same crypt: Computer Vision
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks · Old Age
SSD: Single Shot MultiBox Detector · Old Age
Squeeze-and-Excitation Networks · Old Age
Fast R-CNN · Old Age