From Post To Personality: Harnessing LLMs for MBTI Prediction in Social Media

August 28, 2025 · Declared Dead · 🏛 International Conference on Information and Knowledge Management

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Tian Ma, Kaiyu Feng, Yu Rong, Kangfei Zhao arXiv ID 2509.04461 Category cs.CL: Computation & Language Cross-listed cs.SI Citations 0 Venue International Conference on Information and Knowledge Management Last Checked 4 months ago

Abstract

Personality prediction from social media posts is a critical task that implies diverse applications in psychology and sociology. The Myers Briggs Type Indicator (MBTI), a popular personality inventory, has been traditionally predicted by machine learning (ML) and deep learning (DL) techniques. Recently, the success of Large Language Models (LLMs) has revealed their huge potential in understanding and inferring personality traits from social media content. However, directly exploiting LLMs for MBTI prediction faces two key challenges: the hallucination problem inherent in LLMs and the naturally imbalanced distribution of MBTI types in the population. In this paper, we propose PostToPersonality (PtoP), a novel LLM based framework for MBTI prediction from social media posts of individuals. Specifically, PtoP leverages Retrieval Augmented Generation with in context learning to mitigate hallucination in LLMs. Furthermore, we fine tune a pretrained LLM to improve model specification in MBTI understanding with synthetic minority oversampling, which balances the class imbalance by generating synthetic samples. Experiments conducted on a real world social media dataset demonstrate that PtoP achieves state of the art performance compared with 10 ML and DL baselines.