DeL-haTE: A Deep Learning Tunable Ensemble for Hate Speech Detection

November 03, 2020 · Declared Dead · 🏛 International Conference on Machine Learning and Applications

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Joshua Melton, Arunkumar Bagavathi, Siddharth Krishnan
arXiv ID: 2011.01861
Category: cs.CL (Computation & Language)
Cross-listed: cs.LG
Citations: 21
Venue: International Conference on Machine Learning and Applications
Last Checked: 4 months ago

Abstract

Online hate speech on social media has become a fast-growing problem in recent times. Nefarious groups have developed large content delivery networks across several mainstream (Twitter and Facebook) and fringe (Gab, 4chan, 8chan, etc.) outlets to deliver cascades of hate messages directed both at individuals and communities. Thus, addressing these issues has become a top priority for large-scale social media outlets. Three key challenges in automated detection and classification of hateful content are the lack of clearly labeled data, evolving vocabulary and lexicon (hashtags, emojis, etc.), and the lack of baseline models for fringe outlets such as Gab. In this work, we propose a novel framework with three major contributions. (a) We engineer an ensemble of deep learning models that combines the strengths of state-of-the-art approaches, (b) we incorporate a tuning factor into this framework that leverages transfer learning to conduct automated hate speech classification on unlabeled datasets, like Gab, and (c) we develop a weakly supervised learning methodology that allows our framework to train on unlabeled data. Our ensemble models achieve an 83% hate recall on the HON dataset, surpassing the performance of the state-of-the-art deep models. We demonstrate that weakly supervised training in combination with classifier tuning significantly increases model performance on unlabeled data from Gab, achieving a hate recall of 67%.
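
The abstract does not spell out how the tuning factor works or how hate recall is computed, so the following is only an illustrative sketch, not the authors' implementation. It assumes the tuning factor is a vote threshold: a post is labeled hate when at least `tuning` component models vote hate. The names `ensemble_predict`, `hate_recall`, `HATE`, and `NOT_HATE` are hypothetical, and the votes are toy values, not results from the paper.

```python
from typing import List, Sequence

# Hypothetical label encoding used only for this sketch.
HATE = 1
NOT_HATE = 0


def ensemble_predict(votes: Sequence[Sequence[int]], tuning: int) -> List[int]:
    """Combine per-model labels into one label per post.

    votes[m][i] is model m's label for post i. ASSUMPTION: the tuning
    factor is the minimum number of hate votes needed to label a post hate.
    """
    if not votes:
        return []
    n_posts = len(votes[0])
    predictions = []
    for i in range(n_posts):
        hate_votes = sum(1 for model_votes in votes if model_votes[i] == HATE)
        predictions.append(HATE if hate_votes >= tuning else NOT_HATE)
    return predictions


def hate_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly hateful posts that were predicted as hate."""
    preds_for_hate = [p for t, p in zip(y_true, y_pred) if t == HATE]
    if not preds_for_hate:
        return 0.0
    return sum(1 for p in preds_for_hate if p == HATE) / len(preds_for_hate)


# Toy data: three models, four posts.
toy_votes = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
toy_truth = [1, 1, 1, 0]

for t in (1, 2, 3):
    preds = ensemble_predict(toy_votes, tuning=t)
    print(t, preds, round(hate_recall(toy_truth, preds), 2))
```

Under this assumed scheme, lowering the threshold makes the ensemble more willing to flag a post, which raises hate recall at the likely cost of more false positives; raising it does the opposite.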