• LyS at TASS 2015: Deep Learning Experiments for Sentiment Analysis on Spanish Tweets 

      Vilares, David; Doval, Yerai; Alonso, Miguel A.; Gómez-Rodríguez, Carlos (CEUR-WS Workshop Proceedings, 2015)
      [Abstract]: This paper describes the participation of the LyS group at TASS 2015. In this year’s edition, we used a long short-term memory neural network to address the two proposed challenges: (1) sentiment analysis at ...
    • LyS: Porting a Twitter Sentiment Analysis Approach from Spanish to English 

      Vilares, David; Hermo, Miguel; Alonso, Miguel A.; Gómez-Rodríguez, Carlos; Doval, Yerai (Association for Computational Linguistics, 2014)
      [Abstract]: This paper proposes an approach to solve message- and phrase-level polarity classification in Twitter, derived from an existing system designed for Spanish. As a first step, an ad-hoc preprocessing is performed. ...
    • LyS_ACoruña at SemEval-2022 Task 10: Repurposing Off-the-Shelf Tools for Sentiment Analysis as Semantic Dependency Parsing 

      Alonso-Alonso, Iago; Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2022-07)
      [Abstract]: This paper addressed the problem of structured sentiment analysis using a bi-affine semantic dependency parser, large pre-trained language models, and publicly available translation models. For the monolingual ...
    • Memory limitations are hidden in grammar 

      Gómez-Rodríguez, Carlos; Christiansen, Morten H.; Ferrer-i-Cancho, Ramon (RAM-Verlag, 2022)
      [Abstract] The ability to produce and understand an unlimited number of different sentences is a hallmark of human language. Linguists have sought to define the essence of this generative capacity using formal grammars ...
    • Multidimensional Affective Analysis for Low-Resource Languages: A Use Case with Guarani-Spanish Code-Switching Language 

      Agüero-Torales, Marvin M.; López-Herrera, Antonio G.; Vilares, David (Springer, 2023)
      [Abstract]: This paper focuses on text-based affective computing for Jopara, a code-switching language that combines Guarani and Spanish. First, we collected a dataset of tweets primarily written in Guarani and annotated ...
    • Multitask Pointer Network for Multi-Representational Parsing 

      Fernández-González, Daniel; Gómez-Rodríguez, Carlos (Elsevier, 2022-01-25)
      [Abstract] Dependency and constituent trees are widely used by many artificial intelligence applications for representing the syntactic structure of human languages. Typically, these structures are separately produced by ...
    • Natural Language Parsing: Progress and Challenges 

      Gómez-Rodríguez, Carlos (Sociedad de Estadística e Investigación Operativa, 2018-07)
      [Abstract] Natural language parsing is the task of automatically obtaining the syntactic structure of sentences written in a human language. Parsing is a crucial step for language processing systems that need to extract ...
    • New Treebank or Repurposed? On the Feasibility of Cross-Lingual Parsing of Romance Languages with Universal Dependencies 

      García, Marcos; Gómez-Rodríguez, Carlos; Alonso, Miguel A. (Cambridge University Press, 2018-01)
      [Abstract] This paper addresses the feasibility of cross-lingual parsing with Universal Dependencies (UD) between Romance languages, analyzing its performance when compared to the use of manually annotated resources of the ...
    • Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing 

      Muñoz-Ortiz, Alberto; Strzyz, Michalina; Vilares, David (INCOMA Ltd., 2021-09)
      [Abstract]: Different linearizations have been proposed to cast dependency parsing as sequence labeling and solve the task as: (i) a head selection problem, (ii) finding a representation of the token arcs as bracket ...
    • On the Challenges of Fully Incremental Neural Dependency Parsing 

      Ezquerro, Ana; Gómez-Rodríguez, Carlos; Vilares, David (Association for Computational Linguistics, 2023-11)
      [Abstract]: Since the popularization of BiLSTMs and Transformer-based bidirectional encoders, state-of-the-art syntactic parsers have lacked incrementality, requiring access to the whole sentence and deviating from ...
    • On the Feasibility of Character n-Grams Pseudo-Translation for Cross-Language Information Retrieval Tasks 

      Vilares, Jesús; Vilares Ferro, Manuel; Alonso, Miguel A.; Oakes, Michael P. (2016-03)
      [Abstract] The field of Cross-Language Information Retrieval relates techniques close to both the Machine Translation and Information Retrieval fields, although in a context involving characteristics of its own. The present ...
    • On the Logistical Difficulties and Findings of Jopara Sentiment Analysis 

      Agüero-Torales, Marvin M.; Vilares, David; López-Herrera, Antonio G. (Association for Computational Linguistics, 2021-06)
      [Abstract] This paper addresses the problem of sentiment analysis for Jopara, a code-switching language between Guarani and Spanish. We first collect a corpus of Guarani-dominant tweets and discuss the difficulties of ...
    • On the performance of phonetic algorithms in microtext normalization 

      Doval, Yerai; Vilares Ferro, Manuel; Vilares, Jesús (Elsevier, 2018-12-15)
      [Abstract]: User-generated content published on microblogging social networks constitutes a priceless source of information. However, microtexts usually deviate from the standard lexical and grammatical rules of the language, ...
    • On the Processing and Analysis of Microtexts: From Normalization to Semantics 

      Doval, Yerai; Vilares, David (MDPI AG, 2018-09-18)
      [Abstract] User-generated content published on microblogging social platforms constitutes an invaluable source of information for diverse purposes: health surveillance, business intelligence, political analysis, etc. We ...
    • On the Use of Parsing for Named Entity Recognition 

      Alonso, Miguel A.; Gómez-Rodríguez, Carlos; Vilares, Jesús (MDPI, 2021-01-25)
      [Abstract] Parsing is a core natural language processing technique that can be used to obtain the structure underlying sentences in human languages. Named entity recognition (NER) is the task of identifying the entities ...
    • On the usefulness of lexical and syntactic processing in polarity classification of Twitter messages 

      Vilares, David; Alonso, Miguel A.; Gómez-Rodríguez, Carlos (Wiley, 2015-09)
      [Abstract]: Millions of micro texts are published every day on Twitter. Identifying the sentiment present in them can be helpful for measuring the frame of mind of the public, their satisfaction with respect to a product, ...
    • Optimality of syntactic dependency distances 

      Ferrer-i-Cancho, Ramon; Gómez-Rodríguez, Carlos; Esteban, Juan Luis; Alemany-Puig, Lluís (American Physical Society, 2022-01)
      [Abstract]: It is often stated that human languages, as other biological systems, are shaped by cost-cutting pressures but, to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality ...
    • Parsing as Pretraining 

      Vilares, David; Strzyz, Michalina; Søgaard, Anders; Gómez-Rodríguez, Carlos (2020)
      [Abstract] Recent analyses suggest that encoders pretrained for language modeling capture certain morpho-syntactic structure. However, probing frameworks for word vectors still do not report results on standard setups ...
    • Parsing linearizations appreciate PoS tags - but some are fussy about errors 

      Muñoz-Ortiz, Alberto; Anderson, Mark; Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2022-11)
      [Abstract]: PoS tags, once taken for granted as a useful resource for syntactic parsing, have become more situational with the popularization of deep learning. Recent work on the impact of PoS tags on graph- and ...
    • Prototipado rápido de un sistema de normalización de tuits: una aproximación léxica 

      Vilares, Jesús; Alonso, Miguel A.; Vilares, David (Sociedad Española para el Procesamiento del Lenguaje Natural, 2013)
      [Abstract]: This paper describes the Spanish tweet normalization system developed by the Grupo de Lengua y Sociedad de la Información (LYS) of the Universidade da Coruña for Tweet-Norm 2013. It is a ...