Old Age
Aggregate, Decompose, and Fine-Tune: A Simple Yet Effective Factor-Tuning Method for Vision Transformer
November 12, 2023 · Entered Twilight · arXiv.org
Repo contents: Imagenet_loader.py, README.md, configs, execute.py, figures, methods, requirements.txt, run.sh, vtab.py
Authors
Dongping Chen
arXiv ID
2311.06749
Category
cs.CV: Computer Vision
Cross-listed
cs.AI,
cs.LG
Citations
4
Venue
arXiv.org
Repository
https://github.com/Dongping-Chen/EFFT-EFfective-Factor-Tuning
⭐ 8
Last Checked
3 months ago
Abstract
Recent advancements have illuminated the efficacy of some tensorization-decomposition Parameter-Efficient Fine-Tuning methods like LoRA and FacT in the context of Vision Transformers (ViT). However, these methods grapple with the challenges of inadequately addressing inner- and cross-layer redundancy. To tackle this issue, we introduce EFfective Factor-Tuning (EFFT), a simple yet effective fine-tuning method. Within the VTAB-1K dataset, our EFFT surpasses all baselines, attaining state-of-the-art performance with a categorical average of 75.9% in top-1 accuracy with only 0.28% of the parameters for full fine-tuning. Considering the simplicity and efficacy of EFFT, it holds the potential to serve as a foundational benchmark. The code and model are now available at https://github.com/Dongping-Chen/EFFT-EFfective-Factor-Tuning.
Similar Papers
In the same crypt: Computer Vision
Old Age
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Old Age
SSD: Single Shot MultiBox Detector
Old Age
Squeeze-and-Excitation Networks
Old Age
Fast R-CNN
Old Age