Old Age
EUREKA: EUphemism Recognition Enhanced through Knn-based methods and Augmentation
October 23, 2022 · Entered Twilight · FLP
Repo contents: README.md, data, evaluate.py, knn_evaluate.py, notebooks, requirements.txt, train.py, trainer.py, utils.py
Authors
Sedrick Scott Keh, Rohit K. Bharadwaj, Emmy Liu, Simone Tedeschi, Varun Gangal, Roberto Navigli
arXiv ID
2210.12846
Category
cs.CL: Computation & Language
Citations
8
Venue
FLP
Repository
https://github.com/sedrickkeh/EUREKA
⭐ 9
Last Checked
2 months ago
Abstract
We introduce EUREKA, an ensemble-based approach for performing automatic euphemism detection. We (1) identify and correct potentially mislabelled rows in the dataset, (2) curate an expanded corpus called EuphAug, (3) leverage model representations of Potentially Euphemistic Terms (PETs), and (4) explore using representations of semantically close sentences to aid in classification. Using our augmented dataset and kNN-based methods, EUREKA was able to achieve state-of-the-art results on the public leaderboard of the Euphemism Detection Shared Task, ranking first with a macro F1 score of 0.881. Our code is available at https://github.com/sedrickkeh/EUREKA.
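The abstract's idea of using semantically close sentences to aid classification can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that): the toy 2-dimensional vectors stand in for sentence or PET embeddings from a language model, and the helper names `cosine`, `knn_predict`, and `macro_f1` are hypothetical. The sketch labels a query by majority vote among its k nearest training embeddings and scores predictions with macro F1, the metric reported for the shared task.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(query, train_vecs, train_labels, k=3):
    """Majority label among the k training vectors most similar to the query."""
    ranked = sorted(
        range(len(train_vecs)),
        key=lambda i: cosine(query, train_vecs[i]),
        reverse=True,
    )
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy stand-ins for embeddings (assumption): 1 = euphemistic, 0 = literal.
train_vecs = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.3], [0.1, 1.0], [0.2, 0.9], [0.3, 0.8]]
train_labels = [1, 1, 1, 0, 0, 0]

queries = [[0.95, 0.15], [0.15, 0.95]]
predictions = [knn_predict(q, train_vecs, train_labels) for q in queries]
```

In a real pipeline the vectors would come from an encoder, and the neighbour vote could be combined with a fine-tuned classifier's output rather than used on its own; how EUREKA combines them is not specified in this abstract.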
Similar Papers
In the same crypt: Computation & Language
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
R.I.P.
👻
Ghosted
Language Models are Few-Shot Learners
R.I.P.
👻
Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach
R.I.P.
👻
Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
R.I.P.
👻
Ghosted