Buscar

Mostrando ítems 1-5 de 5

Studying the Effect and Treatment of Misspelled Queries in Cross-Language Information Retrieval

Vilares, Jesús; Alonso, Miguel A.; Doval, Yerai; Vilares Ferro, Manuel (2016-07)

[Abstract] The performance of Information Retrieval systems is limited by the linguistic variation present in natural language texts. Word-level Natural Language Processing techniques have been shown to be useful in reducing ...

Segmentación de palabras en español mediante modelos del lenguaje basados en redes neuronales

Doval, Yerai; Gómez-Rodríguez, Carlos; Vilares, Jesús (Sociedad Española para el Procesamiento del Lenguaje Natural, 2016-09)

[Resumen] En las plataformas de microblogging abundan ciertos tokens especiales como los hashtags o las menciones en los que un grupo de palabras se escriben juntas sin espaciado entre ellas; p.ej.: #añobisiesto o ...

On the performance of phonetic algorithms in microtext normalization

Doval, Yerai; Vilares Ferro, Manuel; Vilares, Jesús (Elsevier, 2018-12-15)

[Abstract]: User–generated content published on microblogging social networks constitutes a priceless source of information. However, microtexts usually deviate from the standard lexical and grammatical rules of the language, ...

Towards Robust Word Embeddings for Noisy Texts

Doval, Yerai; Vilares, Jesús; Gómez-Rodríguez, Carlos (MDPI, 2020)

[Abstract] Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts in the form of tweets and other types of non-standard writing ...

Comparing neural- and N-gram-based language models for word segmentation

Doval, Yerai; Gómez-Rodríguez, Carlos (John Wiley and Sons Inc., 2019-02)

[Abstract]: Word segmentation is the task of inserting or deleting word boundary characters in order to separate character sequences that correspond to words in some language. In this article we propose an approach based ...