Old Age
Taken by Surprise: Contrast effect for Similarity Scores
August 18, 2023 · Entered Twilight · arXiv.org
Repo contents: .gitignore, LICENSE, README.md, assets, notebooks, pyproject.toml, setup.py, surprise_similarity
Authors
Thomas C. Bachlechner, Mario Martone, Marjorie Schillo
arXiv ID
2308.09765
Category
cs.CL: Computation & Language
Cross-listed
cs.AI, cs.IR, cs.LG
Citations
0
Venue
arXiv.org
Repository
https://github.com/MeetElise/surprise-similarity
⭐ 11
Last Checked
3 months ago
Abstract
Accurately evaluating the similarity of object vector embeddings is of critical importance for natural language processing, information retrieval and classification tasks. Popular similarity scores (e.g., cosine similarity) are based on pairs of embedding vectors and disregard the distribution of the ensemble from which objects are drawn. Human perception of object similarity significantly depends on the context in which the objects appear. In this work we propose the $\textit{surprise score}$, an ensemble-normalized similarity metric that encapsulates the contrast effect of human perception and significantly improves the classification performance on zero- and few-shot document classification tasks. This score quantifies the surprise to find a given similarity between two elements relative to the pairwise ensemble similarities. We evaluate this metric on zero- and few-shot classification and clustering tasks and typically find 10-15% better performance compared to raw cosine similarity. Our code is available at https://github.com/MeetElise/surprise-similarity.
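The idea of an ensemble-normalized score can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the API of the surprise_similarity package. The function names `cosine` and `surprise_sketch` are hypothetical, and modelling the query-to-ensemble similarities as a normal distribution is an assumption made here only for illustration. The sketch asks how unusual a given cosine similarity is relative to the similarities between the same query and every element of an ensemble.

```python
import math
from statistics import NormalDist, mean, pstdev


def cosine(u, v):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def surprise_sketch(query, key, ensemble):
    """Hypothetical helper: score cos(query, key) against the ensemble.

    Assumption: the similarities between the query and the ensemble
    elements are summarized by a normal distribution, and the score is
    that distribution's CDF evaluated at cos(query, key).
    """
    sims = [cosine(query, e) for e in ensemble]
    mu = mean(sims)
    sigma = pstdev(sims)
    s = cosine(query, key)
    if sigma == 0:
        # Degenerate ensemble: every element is equally similar to the query.
        return 0.5
    return NormalDist(mu, sigma).cdf(s)


# The same (query, key) pair, scored against two different ensembles.
query = [1.0, 0.0]
key = [0.8, 0.6]

dissimilar_ensemble = [[0.0, 1.0], [-1.0, 0.0], [0.6, 0.8]]
similar_ensemble = [[1.0, 0.0], [0.99, 0.1], [0.95, 0.3]]

score_among_dissimilar = surprise_sketch(query, key, dissimilar_ensemble)
score_among_similar = surprise_sketch(query, key, similar_ensemble)
```

The raw cosine similarity of the pair is identical in both calls, but the normalized score is higher when the rest of the ensemble is far from the query and lower when the ensemble is crowded with near matches. That dependence on context is the contrast effect the abstract refers to.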
Similar Papers
In the same crypt: Computation & Language
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
The Ethereal
Effective Approaches to Attention-based Neural Machine Translation
Old Age
A large annotated corpus for learning natural language inference
Old Age