HateCheck: Functional Tests for Hate Speech Detection Models

December 31, 2020 · Declared Dead · 🏛 Annual Meeting of the Association for Computational Linguistics

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Paul Röttger, Bertram Vidgen, Dong Nguyen, Zeerak Waseem, Helen Margetts, Janet B. Pierrehumbert arXiv ID 2012.15606 Category cs.CL: Computation & Language Citations 334 Venue Annual Meeting of the Association for Computational Linguistics Last Checked 2 months ago

Abstract

Detecting online hate is a difficult task that even state-of-the-art models struggle with. Typically, hate speech detection models are evaluated by measuring their performance on held-out test data using metrics such as accuracy and F1 score. However, this approach makes it difficult to identify specific model weak points. It also risks overestimating generalisable model performance due to increasingly well-evidenced systematic gaps and biases in hate speech datasets. To enable more targeted diagnostic insights, we introduce HateCheck, a suite of functional tests for hate speech detection models. We specify 29 model functionalities motivated by a review of previous research and a series of interviews with civil society stakeholders. We craft test cases for each functionality and validate their quality through a structured annotation process. To illustrate HateCheck's utility, we test near-state-of-the-art transformer models as well as two popular commercial models, revealing critical model weaknesses.

📄 View on arXiv 🌐 View on ar5iv 📑 PDF 🎉 Report Code Found

Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt — Computation & Language

🌅 🌅 Old Age

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, ... (+6 more)

cs.CL 🏛 NeurIPS 📚 166.0K cites 8 years ago

🌅 🌅 Old Age

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, ... (+2 more)

cs.CL 🏛 NAACL 📚 110.2K cites 7 years ago

R.I.P. 👻 Ghosted

Language Models are Few-Shot Learners

Tom B. Brown, Benjamin Mann, ... (+29 more)

cs.CL 🏛 NeurIPS 📚 54.2K cites 5 years ago

R.I.P. 👻 Ghosted

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Yinhan Liu, Myle Ott, ... (+8 more)

cs.CL 🏛 arXiv 📚 28.4K cites 6 years ago

R.I.P. 👻 Ghosted

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Mike Lewis, Yinhan Liu, ... (+6 more)

cs.CL 🏛 ACL 📚 12.3K cites 6 years ago

R.I.P. 👻 Ghosted

Deep contextualized word representations

Matthew E. Peters, Mark Neumann, ... (+5 more)

cs.CL 🏛 NAACL 📚 12.0K cites 8 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, Sam Gross, ... (+19 more)

cs.LG 🏛 NeurIPS 📚 49.7K cites 6 years ago

R.I.P. 👻 Ghosted

XGBoost: A Scalable Tree Boosting System

Tianqi Chen, Carlos Guestrin

cs.LG 🏛 KDD 📚 49.2K cites 10 years ago

R.I.P. 👻 Ghosted

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

cs.LG 🏛 ICML 📚 46.0K cites 11 years ago

R.I.P. 👻 Ghosted

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, Santosh Divvala, ... (+2 more)

cs.CV 🏛 CVPR 📚 43.4K cites 10 years ago