• Multidimensional Affective Analysis for Low-Resource Languages: A Use Case with Guarani-Spanish Code-Switching Language 

      Agüero-Torales, Marvin M.; López-Herrera, Antonio G.; Vilares, David (Springer, 2023)
      [Abstract] This paper focuses on text-based affective computing for Jopara, a code-switching language that combines Guarani and Spanish. First, we collected a dataset of tweets primarily written in Guarani and annotated ...
    • Multitask Pointer Network for Multi-Representational Parsing 

      Fernández-González, Daniel; Gómez-Rodríguez, Carlos (Elsevier, 2022-01-25)
      [Abstract] Dependency and constituent trees are widely used by many artificial intelligence applications for representing the syntactic structure of human languages. Typically, these structures are separately produced by ...
    • Natural Language Parsing : Progress and Challenges 

      Gómez-Rodríguez, Carlos (Sociedad de Estadística e Investigación Operativa, 2018-07)
      [Abstract] Natural language parsing is the task of automatically obtaining the syntactic structure of sentences written in a human language. Parsing is a crucial step for language processing systems that need to extract ...
    • New Treebank or Repurposed? On the Feasibility of Cross-Lingual Parsing of Romance Languages with Universal Dependencies 

      García, Marcos; Gómez-Rodríguez, Carlos; Alonso, Miguel A. (Cambridge University Press, 2018-01)
      [Abstract] This paper addresses the feasibility of cross-lingual parsing with Universal Dependencies (UD) between Romance languages, analyzing its performance when compared to the use of manually annotated resources of the ...
    • On the Feasibility of Character n-Grams Pseudo-Translation for Cross-Language Information Retrieval Tasks 

      Vilares, Jesús; Vilares Ferro, Manuel; Alonso, Miguel A.; Oakes, Michael P. (2016-03)
      [Abstract] The field of Cross-Language Information Retrieval relates techniques close to both the Machine Translation and Information Retrieval fields, although in a context involving characteristics of its own. The present ...
    • On the performance of phonetic algorithms in microtext normalization 

      Doval, Yerai; Vilares Ferro, Manuel; Vilares, Jesús (Elsevier, 2018-12-15)
      [Abstract] User-generated content published on microblogging social networks constitutes a priceless source of information. However, microtexts usually deviate from the standard lexical and grammatical rules of the language, ...
    • On the Use of Parsing for Named Entity Recognition 

      Alonso, Miguel A.; Gómez-Rodríguez, Carlos; Vilares, Jesús (MDPI, 2021-01-25)
      [Abstract] Parsing is a core natural language processing technique that can be used to obtain the structure underlying sentences in human languages. Named entity recognition (NER) is the task of identifying the entities ...
    • On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages 

      Vilares, David; Alonso, Miguel A.; Gómez-Rodríguez, Carlos (Wiley, 2015-09)
      [Abstract] Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product, ...
    • Optimality of syntactic dependency distances 

      Ferrer-i-Cancho, Ramon; Gómez-Rodríguez, Carlos; Esteban, Juan Luis; Alemany-Puig, Lluís (American Physical Society, 2022-01)
      [Abstract] It is often stated that human languages, as other biological systems, are shaped by cost-cutting pressures but, to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality ...
    • Public Sentiment Analysis and Topic Modeling Regarding COVID-19’s Three Waves of Total Lockdown: A Case Study on Movement Control Order in Malaysia 

      Alamoodi, A.H.; Baker, Mohammed Rashad; Albahri, O.S.; Zaidan, B.B.; Zaidan, A.A.; Wong, Wing-Kwong; Garfan, Salem; Albahri, A.S.; Alonso, Miguel A.; Jasim, Ali Najm; Baqer, M.J. (KSII, 2022-07-31)
      [Abstract] The COVID-19 pandemic has affected many aspects of human life. The pandemic not only caused millions of fatalities and problems but also changed public sentiment and behavior. Owing to the magnitude of this ...
    • Restricted Non-Projectivity: Coverage vs. Efficiency 

      Gómez-Rodríguez, Carlos (2016-12)
      [Abstract] In the last decade, various restricted classes of non-projective dependency trees have been proposed with the goal of achieving a good tradeoff between parsing efficiency and coverage of the syntactic structures ...
    • Segmentación de palabras en español mediante modelos del lenguaje basados en redes neuronales 

      Doval, Yerai; Gómez-Rodríguez, Carlos; Vilares, Jesús (Sociedad Española para el Procesamiento del Lenguaje Natural, 2016-09)
      [Abstract] Microblogging platforms abound with special tokens, such as hashtags or mentions, in which a group of words is written together without spaces between them; e.g., #añobisiesto or ...
    • Sentiment Analysis for Fake News Detection 

      Alonso, Miguel A.; Vilares, David; Gómez-Rodríguez, Carlos; Vilares, Jesús (MDPI, 2021)
      [Abstract] In recent years, we have witnessed a rise in fake news, i.e., provably false pieces of information created with the intention of deception. The dissemination of this type of news poses a serious threat to cohesion ...
    • Studying the Effect and Treatment of Misspelled Queries in Cross-Language Information Retrieval 

      Vilares, Jesús; Alonso, Miguel A.; Doval, Yerai; Vilares Ferro, Manuel (2016-07)
      [Abstract] The performance of Information Retrieval systems is limited by the linguistic variation present in natural language texts. Word-level Natural Language Processing techniques have been shown to be useful in reducing ...
    • Supervised sentiment analysis in multilingual environments 

      Vilares, David; Alonso, Miguel A.; Gómez-Rodríguez, Carlos (Elsevier, 2017-05)
      [Abstract] This article tackles the problem of performing multilingual polarity classification on Twitter, comparing three techniques: (1) a multilingual model trained on a multilingual dataset, obtained by fusing existing ...
    • Surfing the Modeling of PoS Taggers in Low-Resource Scenarios 

      Vilares Ferro, Manuel; Darriba Bilbao, Víctor M.; Ribadas Pena, Francisco José; Graña Gil, Jorge (MDPI, 2022-09-27)
      [Abstract] The recent trend toward the application of deep structured techniques has revealed the limits of huge models in natural language processing. This has reawakened the interest in traditional machine learning ...
    • The Impact of Edge Displacement Vaserstein Distance on UD Parsing Performance 

      Anderson, Mark; Gómez-Rodríguez, Carlos (The MIT Press, 2022)
      [Abstract] We contribute to the discussion on parsing performance in NLP by introducing a measurement that evaluates the differences between the distributions of edge displacement (the directed distance of edges) seen in ...
    • The megaphone of the people? Spanish SentiStrength for real-time analysis of political tweets 

      Vilares, David; Thelwall, Mike; Alonso, Miguel A. (SAGE Publications & CILIP, 2015)
      [Abstract] Twitter is an important platform for sharing opinions about politicians, parties and political decisions. These opinions can be exploited as a source of information to monitor the impact of politics on society. ...
    • The scaling of the minimum sum of edge lengths in uniformly random trees 

      Esteban, Juan Luis; Ferrer-i-Cancho, Ramon; Gómez-Rodríguez, Carlos (2016-06)
      [Abstract] The minimum linear arrangement problem on a network consists of finding the minimum sum of edge lengths that can be achieved when the vertices are arranged linearly. Although there are algorithms to solve this ...
    • Towards Robust Word Embeddings for Noisy Texts 

      Doval, Yerai; Vilares, Jesús; Gómez-Rodríguez, Carlos (MDPI, 2020)
      [Abstract] Research on word embeddings has mainly focused on improving their performance on standard corpora, disregarding the difficulties posed by noisy texts in the form of tweets and other types of non-standard writing ...