Predicting User Demographics, Emotions and Opinions in Social Networks
| Content Provider | Semantic Scholar |
|---|---|
| Author | Volkova, Svitlana |
| Copyright Year | 2016 |
| Abstract | Social networks are virtual environments where people express their thoughts, emotions, and opinions. We analyze a large sample of 123,000 Twitter users and 25 million tweets to investigate the relation between user emotions and predicted demographics. Our methodology is based on building machine learning models to predict emotions and demographic profiles from user content. We report novel demographic-affect correlations and their implications for online self-disclosure. AUDIENCE: [Natural Language Processing] [Machine Learning] [Social Network Analysis] [Opinion Mining] [Emotion Detection] [Demographic Classification] [Advanced Technical Talk]. INTRODUCTION: Twitter and Facebook are prominent social networks, used regularly by over 1/7th of the world's population. Researchers have used these massive volumes of data to study how users present themselves and the language they use [1], showing how to predict user psycho-demographic profiles [2,3], user emotions [4], and well-being. This study analyzes user communications in a social network on a large scale (25 million tweets, 123,513 user profiles), examining a range of automatically detected emotions, opinions, and a variety of demographic traits. This work can help social network users understand how others may perceive them based on how they communicate in social media, in addition to its evident applications in online sales and marketing, targeted advertising, large-scale polling, and healthcare analytics. DATA: Focusing on Twitter, we use crowdsourcing to obtain demographic labels for a sample of U = 5,000 users (Table 1) and train machine learning classifiers to predict these demographic traits from the textual content generated by these users. We then apply these attribute classifiers to obtain labels for a much larger sample of U = 123,513 users. We use a similar method for labeling emotions expressed in user text. We train an emotion classifier on an initial sample of TE = 52,925 tweets, then use the classifier to obtain emotion labels for a much larger sample of T = 24,919,528 tweets. However, rather than obtaining emotion tags for the initial sample through crowdsourcing, we use tweets annotated with emotional hashtags such as #disgust or #anger, each identifying a specific emotion. To perform a reliable analysis of the differences in emotion expressed by different user groups, our demographic and emotion predictions must be highly accurate. We report that our models for emotion and demographic classification outperform the existing state-of-the-art systems. Our high-level methodology is shown in Figure 1. Figure 1: Our approach for predicting user demographics, emotions, and opinions in social media. Table 1: Profile annotations collected via crowdsourcing (attribute: binary attribute values). Gender: Male 2,124, Female 2,874; Age: Below 25 2,511, Above 25 1,372; Ethnicity: Afr. Amer. 1,705, Caucasian 2,409; Education: High school 3,423, Degree 1,575; Income: <$35K 3,324, >$35K 1,675; Children: Yes 797, No 4,203; Optimism: Pessimist 907, Optimist 2,655; Life Satisfaction: Dissatisfied 840, Satisfied 2,949. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www.cs.jhu.edu/~svitlana/papers/V_GHC.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
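
The abstract describes obtaining emotion labels from tweets that carry emotional hashtags such as #disgust or #anger, rather than from crowdsourcing. A minimal Python sketch of that labeling step follows. It is an illustration under stated assumptions, not the author's implementation: the names `EMOTION_TAGS`, `label_by_hashtag`, and `build_training_set` are hypothetical, the tag set contains only the two hashtags named in the abstract, and the choices to skip tweets with several emotion hashtags and to strip the hashtag from the text are assumptions.

```python
import re

# Emotion hashtags named in the abstract. The full tag set used in the
# study is not given in this record, so this tuple is an assumption.
EMOTION_TAGS = ("anger", "disgust")

HASHTAG_RE = re.compile(r"#(\w+)")


def label_by_hashtag(tweet, emotions=EMOTION_TAGS):
    """Return (text_without_emotion_tag, emotion), or None when the tweet
    carries zero or more than one distinct emotion hashtag."""
    tags = {t.lower() for t in HASHTAG_RE.findall(tweet)}
    found = sorted(tags & set(emotions))
    if len(found) != 1:
        return None  # unlabeled or ambiguous (assumed to be skipped)
    emotion = found[0]
    # Remove the label-bearing hashtag so that a downstream classifier
    # cannot simply memorize it (an assumed preprocessing choice).
    pattern = re.compile(r"#" + re.escape(emotion) + r"\b", re.IGNORECASE)
    text = " ".join(pattern.sub("", tweet).split())
    return text, emotion


def build_training_set(tweets):
    """Keep only the tweets that yield exactly one emotion label."""
    labeled = (label_by_hashtag(t) for t in tweets)
    return [pair for pair in labeled if pair is not None]
```

A classifier trained on the resulting (text, emotion) pairs could then be applied to tweets without emotional hashtags, which corresponds to the second stage the abstract outlines.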