Building a New Sentiment Analysis Dataset for Uzbek Language and Creating Baseline Models

Authors

Kuriyozov, Elmurod
Matlatipov, Sanatbek

Bibliographic citation

Kuriyozov, E.; Matlatipov, S. Building a New Sentiment Analysis Dataset for Uzbek Language and Creating Baseline Models. Proceedings 2019, 21, 37.

Abstract

Making natural language processing technologies available for low-resource languages is an important step toward improving access to technology for their speaker communities. In this paper, we provide the first annotated corpora for polarity classification in the Uzbek language. Our methodology involves collecting a medium-sized, manually annotated dataset and a larger dataset automatically translated from existing resources. We then use these datasets to train sentiment analysis models for Uzbek, using both traditional machine learning techniques and recent deep learning models.

Rights

Creative Commons License Attribution (CC BY 4.0)

Except where otherwise noted, this item's license is described as Creative Commons License Attribution (CC BY 4.0)