Old Age
Improving Contrastive Learning of Sentence Embeddings with Case-Augmented Positives and Retrieved Negatives
June 06, 2022 · Entered Twilight · Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Repo contents: LICENSE, README.md, data, examples, models, requirements.txt, scripts, training, utils
Authors
Wei Wang, Liangzhu Ge, Jingqiao Zhang, Cheng Yang
arXiv ID
2206.02457
Category
cs.CL: Computation & Language
Cross-listed
cs.IR
Citations
26
Venue
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Repository
https://github.com/alibaba/SimCSE-with-CARDS
⭐ 16
Last Checked
1 month ago
Abstract
Following SimCSE, contrastive learning based methods have achieved state-of-the-art (SOTA) performance in learning sentence embeddings. However, unsupervised contrastive learning methods still lag far behind their supervised counterparts. We attribute this to the quality of positive and negative samples, and aim to improve both. Specifically, for positive samples, we propose switch-case augmentation, which flips the case of the first letter of randomly selected words in a sentence. This counteracts the intrinsic bias of pre-trained token embeddings toward word frequency, case, and subwords. For negative samples, we retrieve hard negatives from the whole dataset based on a pre-trained language model. Combining these two methods with SimCSE, our proposed Contrastive learning with Augmented and Retrieved Data for Sentence embedding (CARDS) method significantly surpasses the current SOTA on STS benchmarks in the unsupervised setting.
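For illustration, here is a minimal sketch of the two ideas described in the abstract: per-word switch-case augmentation and hard-negative retrieval by nearest-neighbor search over pre-computed, L2-normalized sentence embeddings. The names `switch_case_augment`, `retrieve_hard_negatives`, `flip_prob`, and `top_k` are illustrative assumptions, not the identifiers used in the linked repository.

```python
import random
import numpy as np

def switch_case_augment(sentence, flip_prob=0.15, rng=None):
    """Switch-case augmentation: flip the case of the first letter of
    randomly selected words to build a positive view of the sentence.
    `flip_prob` is an assumed hyperparameter name, not the repo's."""
    rng = rng or random.Random()
    out = []
    for word in sentence.split():
        if word and word[0].isalpha() and rng.random() < flip_prob:
            word = word[0].swapcase() + word[1:]
        out.append(word)
    return " ".join(out)

def retrieve_hard_negatives(embeddings, query_index, top_k=1):
    """Hard-negative retrieval: return indices of the sentences most similar
    to the query under a pre-trained encoder, excluding the query itself.
    `embeddings` is an (N, d) array of L2-normalized sentence embeddings."""
    sims = embeddings @ embeddings[query_index]
    sims[query_index] = -np.inf  # never retrieve the query sentence itself
    return np.argsort(-sims)[:top_k].tolist()

if __name__ == "__main__":
    rng = random.Random(0)
    print(switch_case_augment("contrastive learning of sentence embeddings",
                              flip_prob=0.5, rng=rng))
    embs = np.random.default_rng(0).standard_normal((100, 32))
    embs /= np.linalg.norm(embs, axis=1, keepdims=True)
    print(retrieve_hard_negatives(embs, query_index=3, top_k=2))
```

In SimCSE-style training, the augmented sentence would serve as the positive for its original, and the retrieved nearest neighbors as additional hard negatives, per the description in the abstract.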
Similar Papers
In the same crypt: Computation & Language
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding · R.I.P. · Ghosted
Language Models are Few-Shot Learners · R.I.P. · Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach · R.I.P. · Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension · R.I.P. · Ghosted