A framework for streamlined statistical prediction using topic models
April 15, 2019 · Declared Dead · LaTeCH@NAACL-HLT
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Vanessa Glenny, Jonathan Tuke, Nigel Bean, Lewis Mitchell
arXiv ID
1904.06941
Category
stat.AP
Cross-listed
cs.CL
Citations
2
Venue
LaTeCH@NAACL-HLT
Last Checked
4 months ago
Abstract
In the Humanities and Social Sciences, there is increasing interest in approaches to information extraction, prediction, intelligent linkage, and dimension reduction applicable to large text corpora. With approaches in these fields being grounded in traditional statistical techniques, the need arises for frameworks whereby advanced NLP techniques such as topic modelling may be incorporated within classical methodologies. This paper provides a classical, supervised, statistical learning framework for prediction from text, using topic models as a data reduction method and the topics themselves as predictors, alongside typical statistical tools for predictive modelling. We apply this framework in a Social Sciences context (applied animal behaviour) as well as a Humanities context (narrative analysis) as examples of this framework. The results show that topic regression models perform comparably to their much less efficient equivalents that use individual words as predictors.
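The pipeline the abstract describes (fit a topic model, then use each document's topic proportions as predictors in a standard statistical model) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a small collapsed Gibbs sampler for LDA stands in for whichever topic model the paper uses, logistic regression fitted by gradient descent stands in for its "typical statistical tools", and the toy corpus, labels, function names, and hyperparameters are all invented for the example. It only computes in-sample topic proportions and does not address inference for unseen documents.

```python
import math
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # topic counts per document
    topic_word = [[0] * n_vocab for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                           # total words per topic
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)
    # Resample each token's topic conditional on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                probs_k = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=probs_k)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed topic proportions; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Logistic regression by batch gradient descent; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, target in zip(X, y):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - target
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


# Hypothetical toy corpus: two loose themes with invented binary labels.
docs = [
    "dog bark play fetch dog".split(),
    "dog play ball fetch bark".split(),
    "bark dog ball play dog".split(),
    "fetch ball dog bark play".split(),
    "story hero journey plot hero".split(),
    "plot story villain hero journey".split(),
    "journey hero plot story villain".split(),
    "villain plot story journey hero".split(),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Step 1: data reduction. Each document becomes a short vector of topic proportions.
theta = fit_lda(docs, n_topics=2)
# Step 2: the topic proportions are the predictors in an ordinary regression model.
# (Proportions sum to 1, so they are collinear with the intercept; a careful
# analysis would typically drop one topic.)
weights, bias = fit_logistic(theta, labels)
probs = [sigmoid(bias + sum(w * x for w, x in zip(weights, row))) for row in theta]
```

The point of the design is the size of the predictor set: the regression sees two topic proportions per document rather than one column per vocabulary word. In practice an established implementation such as scikit-learn's `LatentDirichletAllocation` would likely replace the hand-written sampler.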
Similar Papers
In the same crypt: stat.AP
Sequence-to-point learning with neural networks for nonintrusive load monitoring
Predictive Business Process Monitoring with LSTM Neural Networks
Forecasting: theory and practice
Accurate estimation of influenza epidemics using Google search data via ARGO
Survey of resampling techniques for improving classification performance in unbalanced datasets
Died the same way: Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
In-Datacenter Performance Analysis of a Tensor Processing Unit
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning