EN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis
Use this link to cite
http://hdl.handle.net/2183/34966
Except where otherwise noted, this item's license is described as Attribution-NonCommercial 4.0 International
Title
EN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis
Date
2016-05
Citation
David Vilares, Miguel A. Alonso, and Carlos Gómez-Rodríguez. 2016. EN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 4149–4153, Portorož, Slovenia. European Language Resources Association (ELRA).
Abstract
Code-switching texts are those that contain terms in two or more different languages, and they appear increasingly often in social media. The aim of this paper is to provide a resource to the research community to evaluate the performance of sentiment classification techniques on this complex multilingual environment, proposing an English-Spanish corpus of tweets with code-switching (EN-ES-CS CORPUS). The tweets are labeled according to two well-known criteria used for this purpose: SentiStrength and a trinary scale (positive, neutral and negative categories). Preliminary work on the resource is already done, providing a set of baselines for the research community.
Keywords
Sentiment Analysis
Corpus Generation
Code-Switching
Rights
Attribution-NonCommercial 4.0 International
ISBN
978-2-9517408-9-1