5th Place Solution to Kaggle Google Universal Image Embedding Competition

October 18, 2022 · Entered Twilight · 🏛 arXiv.org

💤 TWILIGHT: Eternal Rest
Repo abandoned since publication

Repo contents: .gitignore, LICENSE, README.md, directory_structure.txt, make_torchscript_wcls.py, models, preprocess_GPR1200.py, preprocess_food101.py, preprocess_glr2021_products10k.py, requirements.txt, src, tfrec2png.py, train_arcface.py, train_classifier.py

Authors: Noriaki Ota, Shingo Yokoi, Shinsuke Yamaoka
arXiv ID: 2210.09495
Category: cs.CV: Computer Vision
Citations: 2
Venue: arXiv.org
Repository: https://github.com/riron1206/kaggle-Google-Universal-Image-Embedding-Competition-5th-Place-Solution ⭐ 7
Last Checked: 3 months ago

Abstract
In this paper, we present our solution, which placed 5th in the Kaggle Google Universal Image Embedding Competition in 2022. We use the ViT-H visual encoder of CLIP from the OpenCLIP repository as a backbone and train a head model composed of BatchNormalization and Linear layers using ArcFace. The training data is a subset of Products10K, GLDv2, GPR1200, and Food101. Applying test-time augmentation (TTA) to part of the images also improves the score. With this method, we achieve a score of 0.684 on the public leaderboard and 0.688 on the private leaderboard. Our code is available at https://github.com/riron1206/kaggle-Google-Universal-Image-Embedding-Competition-5th-Place-Solution
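
The abstract names ArcFace as the training objective for the head. As a rough illustration only, the sketch below computes ArcFace-style logits for a single embedding in plain Python: it L2-normalizes the embedding and the class weight vectors, adds an angular margin to the target class, and scales the cosines before cross-entropy. The function names, the `scale=30.0` and `margin=0.5` defaults, and the toy vectors are assumptions made for this example; they are not taken from the paper or its repository.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def arcface_logits(embedding, class_weights, target, scale=30.0, margin=0.5):
    """Scaled cosine logits with an additive angular margin on the target class.

    scale and margin are illustrative defaults (assumptions), not the
    values used by the authors.
    """
    e = l2_normalize(embedding)
    logits = []
    for k, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos = max(-1.0, min(1.0, cos))  # guard acos against rounding error
        if k == target:
            # ArcFace penalty: replace cos(theta) with cos(theta + margin).
            # This sketch omits the usual guard for theta + margin > pi.
            cos = math.cos(math.acos(cos) + margin)
        logits.append(scale * cos)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


# Toy example: a 2-D embedding equally close to two class centers.
embedding = [1.0, 1.0]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
loss_plain = cross_entropy(arcface_logits(embedding, class_weights, 0, margin=0.0), 0)
loss_margin = cross_entropy(arcface_logits(embedding, class_weights, 0), 0)
```

With the margin enabled, the target class logit is lowered, so `loss_margin` is larger than `loss_plain` for the same embedding. That extra penalty is what pushes embeddings of the same class closer together in angle during training.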
