How does depression talk on social media? Modeling depression language with relevance-based statistical language models
| UDC.coleccion | Investigación | |
| UDC.departamento | Ciencias da Computación e Tecnoloxías da Información | |
| UDC.grupoInv | Information Retrieval Lab (IRlab) | |
| UDC.institutoCentro | CITIC - Centro de Investigación de Tecnoloxías da Información e da Comunicación | |
| UDC.journalTitle | Online Social Networks and Media (OSNEM) | |
| UDC.startPage | 100339 | |
| UDC.volume | 50 | |
| dc.contributor.author | Bao, Eliseo | |
| dc.contributor.author | Pérez, Anxo | |
| dc.contributor.author | Otero, David | |
| dc.contributor.author | Parapar, Javier | |
| dc.date.accessioned | 2025-11-05T12:57:04Z | |
| dc.date.available | 2025-11-05T12:57:04Z | |
| dc.date.issued | 2025-10-17 | |
| dc.description.abstract | [Abstract]: Many individuals with mental health problems turn to the internet and social media for information and support. The text generated on these platforms serves as a valuable resource for identifying mental health risks, driving interdisciplinary research to develop models for mental health analysis and prediction. In this paper, we model depression-related language using relevance-based statistical language models to create lexicons that characterize linguistic patterns associated with depression. We also propose a ranking method that leverages these lexicons to prioritize users exhibiting stronger signs of depressive language on social media. Our models integrate clinical markers from established depression questionnaires, particularly the Beck Depression Inventory-II (BDI-II), enhancing explainability, generalization, and performance. Experiments across multiple social media datasets show that incorporating clinical knowledge improves user ranking and generalizes effectively across platforms. Additionally, we refine existing depression lexicons by applying weights estimated from our models, achieving better performance in generating depression-related queries. A comparative analysis of our models highlights differences in language use between control users and those with depression, aligning with prior psycholinguistic findings. This work advances the understanding of depression-related language through statistical modeling, paving the way for scalable social media interventions to identify at-risk individuals. | |
| dc.description.sponsorship | This work has received support from projects: PLEC2021-007662 (MCIN/AEI/10.13039/501100011033, Ministerio de Ciencia e Innovación, European Union NextGeneration) and PID2022-137061OB-C21 (MCIN/AEI/10.13039/501100011033, Ministerio de Ciencia e Innovación, by the European Union); Consellería de Educación, Universidade e Formación Profesional, Spain (grant number ED481A-2024-079 and accreditation 2019–2022 ED431G/01 and GRC ED431C 2025/49) and the European Regional Development Fund, which acknowledges the CITIC Research Center. | |
| dc.description.sponsorship | Xunta de Galicia; ED481A-2024-079 | |
| dc.description.sponsorship | Xunta de Galicia; ED431G/01 | |
| dc.description.sponsorship | Xunta de Galicia; GRC ED431C 2025/49 | |
| dc.identifier.citation | E. Bao, A. Pérez, D. Otero, and J. Parapar, "How does depression talk on social media? Modeling depression language with relevance-based statistical language models", Online Social Networks and Media, vol. 50, p. 100339, Dec. 2025, doi: 10.1016/j.osnem.2025.100339 | |
| dc.identifier.doi | 10.1016/j.osnem.2025.100339 | |
| dc.identifier.issn | 2468-6964 | |
| dc.identifier.uri | https://hdl.handle.net/2183/46281 | |
| dc.language.iso | eng | |
| dc.publisher | Elsevier | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PLEC2021-007662/ES/BIG-eRISK: PREDICCIÓN TEMPRANA DE RIESGOS PERSONALES EN CONJUNTOS DE DATOS MASIVOS | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-137061OB-C21/ES/BUSQUEDA, SELECCION Y ORGANIZACION DE CONTENIDOS PARA NECESIDADES DE INFORMACION RELACIONADAS CON LA SALUD - CONSTRUCCION DE RECURSOS Y PERSONALIZACION | |
| dc.relation.uri | https://doi.org/10.1016/j.osnem.2025.100339 | |
| dc.rights | © 2025 The Authors | |
| dc.rights | Attribution 4.0 International | en |
| dc.rights.accessRights | open access | |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | Mental health | |
| dc.subject | Depression | |
| dc.subject | Language modeling | |
| dc.subject | Natural language processing | |
| dc.subject | Text mining | |
| dc.subject | Social media | |
| dc.subject | User risk assessment | |
| dc.subject | Clinical markers | |
| dc.subject | Linguistic patterns | |
| dc.subject | Psycholinguistics | |
| dc.title | How does depression talk on social media? Modeling depression language with relevance-based statistical language models | |
| dc.type | journal article | |
| dc.type.hasVersion | VoR | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 99ed6581-6dee-442a-9b37-c35da63bef8a | |
| relation.isAuthorOfPublication | c673c8b1-1afc-48f6-85e9-8f29f9cffb91 | |
| relation.isAuthorOfPublication | 00d04042-9b75-419e-9aab-33fd14b201af | |
| relation.isAuthorOfPublication | fef1a9cb-e346-4e53-9811-192e144f09d0 | |
| relation.isAuthorOfPublication.latestForDiscovery | 99ed6581-6dee-442a-9b37-c35da63bef8a |
Files
Original bundle
- Name: Bao_Eliseo_2025_How_does_depression_talk_socialmedia.pdf
- Size: 2.15 MB
- Format: Adobe Portable Document Format

