Sub-Story Detection in Twitter with Hierarchical Dirichlet Processes

June 11, 2016 Β· Declared Dead Β· πŸ› Information Processing & Management

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors P. K. Srijith, Mark Hepple, Kalina Bontcheva, Daniel Preotiuc-Pietro arXiv ID 1606.03561 Category cs.IR: Information Retrieval Cross-listed cs.SI Citations 53 Venue Information Processing & Management Last Checked 3 months ago
Abstract
Social media has now become the de facto information source on real world events. The challenge, however, due to the high volume and velocity nature of social media streams, is in how to follow all posts pertaining to a given event over time, a task referred to as story detection. Moreover, there are often several different stories pertaining to a given event, which we refer to as sub-stories and the corresponding task of their automatic detection as sub-story detection. This paper proposes hierarchical Dirichlet processes (HDP), a probabilistic topic model, as an effective method for automatic sub-story detection. HDP can learn sub-topics associated with sub-stories which enables it to handle subtle variations in sub-stories. It is compared with state- of-the-art story detection approaches based on locality sensitive hashing and spectral clustering. We demonstrate the superior performance of HDP for sub-story detection on real world Twitter data sets using various evaluation measures. The ability of HDP to learn sub-topics helps it to recall the sub- stories with high precision. Another contribution of this paper is in demonstrating that the conversational structures within the Twitter stream can be used to improve sub-story detection performance significantly.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Information Retrieval

Died the same way β€” πŸ‘» Ghosted