The Secret Lives of Names? Name Embeddings from Social Media
May 12, 2019 Β· Declared Dead Β· π Knowledge Discovery and Data Mining
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Junting Ye, Steven Skiena
arXiv ID
1905.04799
Category
cs.SI: Social & Info Networks
Cross-listed
cs.CL
Citations
28
Venue
Knowledge Discovery and Data Mining
Last Checked
3 months ago
Abstract
Your name tells a lot about you: your gender, ethnicity and so on. It has been shown that name embeddings are more effective in representing names than traditional substring features. However, our previous name embedding model is trained on private email data and are not publicly accessible. In this paper, we explore learning name embeddings from public Twitter data. We argue that Twitter embeddings have two key advantages: \textit{(i)} they can and will be publicly released to support research community. \textit{(ii)} even with a smaller training corpus, Twitter embeddings achieve similar performances on multiple tasks comparing to email embeddings. As a test case to show the power of name embeddings, we investigate the modeling of lifespans. We find it interesting that adding name embeddings can further improve the performances of models using demographic features, which are traditionally used for lifespan modeling. Through residual analysis, we observe that fine-grained groups (potentially reflecting socioeconomic status) are the latent contributing factors encoded in name embeddings. These were previously hidden to demographic models, and may help to enhance the predictive power of a wide class of research studies.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Social & Info Networks
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
Fake News Detection on Social Media: A Data Mining Perspective
R.I.P.
π»
Ghosted
Natural Scales in Geographical Patterns
R.I.P.
π»
Ghosted
Representation Learning on Graphs: Methods and Applications
R.I.P.
π»
Ghosted
The COVID-19 Social Media Infodemic
R.I.P.
π»
Ghosted
OSMnx: New Methods for Acquiring, Constructing, Analyzing, and Visualizing Complex Street Networks
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted