Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling

Use this link to cite
http://hdl.handle.net/2183/31181
Collections
- Investigación (FIC) [1635]
Metadata
Date
2021
Citation
PIOT-PEREZ-ABADIN, Paloma; MARTÍN-RODILLA, Patricia; PARAPAR, Javier. Experimental Analysis of the Relevance of Features and Effects on Gender Classification Models for Social Media Author Profiling. In ENASE. 2021. p. 103-113.
Abstract
Automatic user profiling from social networks has become a popular task due to its commercial applications (targeted advertising, market studies, etc.). Automatic profiling models infer demographic characteristics of social network users from the content they generate or from their interactions. Users' demographic information is also valuable for tasks of greater social concern, such as the automatic early detection of mental disorders. For this type of user analysis, it has been shown that the way users employ language is an important indicator that contributes to the effectiveness of the models. We therefore consider that, for identifying aspects such as a user's gender, age, or origin, it is worth examining language use through both psycho-linguistic and semantic features. A good selection of features is vital for the performance of retrieval, classification, and decision-making software systems. In this paper, we address gender classification as part of the automatic profiling task. We present an experimental analysis of the performance of existing gender classification models based on external corpus data and baselines for automatic profiling. We analyse in depth the influence of linguistic features on the classification accuracy of the models. Following that analysis, we assemble a feature set for gender classification models in social networks that achieves accuracy above existing baselines.
Keywords
Gender classification
Author profiling
Feature relevance
Social media