    • 20 years of the Grammar Matrix: cross-linguistic hypothesis testing of increasingly complex interactions 

      Zamaraeva, Olga; Curtis, Chris; Emerson, Guy; Fokkens, Antske; Goodman, Michael Wayne; Howell, Kristen; Trimble, T.J.; Bender, Emily M. (Institute of Computer Science, Polish Academy of Sciences, 2022-10-20)
      [Abstract] The Grammar Matrix project is a meta-grammar engineering framework expressed in Head-driven Phrase Structure Grammar (HPSG) and Minimal Recursion Semantics (MRS). It automates grammar implementation and is thus ...
    • 4 and 7-bit Labeling for Projective and Non-Projective Dependency Trees 

      Gómez-Rodríguez, Carlos; Roca Rodríguez, Diego; Vilares, David (Association for Computational Linguistics, 2023-12)
      [Abstract]: We introduce an encoding for parsing as sequence labeling that can represent any projective dependency tree as a sequence of 4-bit labels, one per word. The bits in each word’s label represent (1) whether it ...
    • A comparison of statistical association measures for identifying dependency-based collocations in various languages 

      García, Marcos; García Salido, Marcos; Alonso-Ramos, Margarita (Association for Computational Linguistics (ACL), 2019)
      [Abstract] This paper presents an exploration of different statistical association measures to automatically identify collocations from corpora in English, Portuguese, and Spanish. To evaluate the impact of the association ...
    • A linguistic approach for determining the topics of Spanish Twitter messages 

      Vilares, David; Alonso, Miguel A.; Gómez-Rodríguez, Carlos (SAGE Publications & CILIP, 2015)
      [Abstract]: The vast number of opinions and reviews provided in Twitter is helpful in order to make interesting findings about a given industry, but given the huge number of messages published every day, it is important ...
    • A non-projective greedy dependency parser with bidirectional LSTMs 

      Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2017-08)
      [Abstract]: The LyS-FASTPARSE team present BIST-COVINGTON, a neural implementation of the Covington (2001) algorithm for non-projective dependency parsing. The bidirectional LSTM approach by Kiperwasser and Goldberg (2016) ...
    • A review on political analysis and social media 

      Vilares, David; Alonso, Miguel A. (Sociedad Española para el Procesamiento del Lenguaje Natural, 2016)
      [Abstract] In democratic countries, forecasting the voting intentions of citizens and knowing their opinions on major political parties and leaders is of great interest to the parties themselves, to the media, and to the ...
    • A syntactic approach for opinion mining on Spanish reviews 

      Vilares, David; Alonso, Miguel A.; Gómez-Rodríguez, Carlos (Cambridge University Press, 2015-01)
      [Abstract]: We describe an opinion mining system which classifies the polarity of Spanish texts. We propose an NLP approach that undertakes pre-processing, tokenisation and POS tagging of texts to then obtain the syntactic ...
    • A Transition-Based Algorithm for Unrestricted AMR Parsing 

      Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2018-06)
      [Abstract]: Non-projective parsing can be useful to handle cycles and reentrancy in AMR graphs. We explore this idea and introduce a greedy left-to-right non-projective transition-based parser. At each parsing configuration, ...
    • A Unifying Theory of Transition-based and Sequence Labeling Parsing 

      Gómez-Rodríguez, Carlos; Strzyz, Michalina; Vilares, David (International Committee on Computational Linguistics, 2020-12)
      [Abstract]: We define a mapping from transition-based parsing algorithms that read sentences from left to right to sequence labeling encodings of syntactic trees. This not only establishes a theoretical relation between ...
    • Absolute convergence and error thresholds in non-active adaptive sampling 

      Vilares Ferro, Manuel; Darriba Bilbao, Víctor M.; Vilares, Jesús (Elsevier Inc., 2022-05)
      [Abstract] Non-active adaptive sampling is a way of building machine learning models from a training data base which are supposed to dynamically and automatically derive guaranteed sample size. In this context and regardless ...
    • Alternances actantielles et la montée du possesseur: une étude de cas en espagnol 

      Alonso-Ramos, Margarita (2009)
      [Abstract] This article studies the syntactic realization of the possessors of a direct object as a syntactic dependent of the verb, that is, what is known as "possessor raising": besar a María en la frente ...
    • Another Dead End for Morphological Tags? Perturbed Inputs and Parsing 

      Muñoz-Ortiz, Alberto; Vilares, David (Association for Computational Linguistics, 2023-07)
      [Abstract]: The usefulness of part-of-speech tags for parsing has been heavily questioned due to the success of word-contextualized parsers. Yet, most studies are limited to coarse-grained tags and high quality written ...
    • Any papyrus about "a hand over a stool and a bread loaf, followed by a boat"? Dealing with hieroglyphic texts in IR 

      Iglesias-Franjo, Estíbaliz; Vilares, Jesús (ACM International Conference Proceeding Series, 2016-06)
      [Abstract] Digital Heritage deals with the use of computing and information technologies for the preservation and study of the human cultural legacy. Within this context, we present here a Text Retrieval system developed ...
    • Una aproximación supervisada para la minería de opiniones sobre tuits en español en base a conocimiento lingüístico 

      Vilares, David; Alonso, Miguel A.; Gómez-Rodríguez, Carlos (Sociedad Española para el Procesamiento del Lenguaje Natural, 2013)
      [Abstract]: This article describes a system for classifying the polarity of tweets written in Spanish. It adopts a hybrid approach that combines linguistic knowledge obtained through NLP with ...
    • Artificially Evolved Chunks for Morphosyntactic Analysis 

      Anderson, Mark; Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2019-08)
      [Abstract]: We introduce a language-agnostic evolutionary technique for automatically extracting chunks from dependency treebanks. We evaluate these chunks on a number of morphosyntactic tasks, namely POS tagging, ...
    • Asignación de niveles de aprendizaje a las colocaciones del Diccionario de Colocaciones del español 

      García Salido, Marcos; Alonso-Ramos, Margarita (Pontificia Universidad Católica de Valparaíso. Instituto de Literatura y Ciencias del Lenguaje, 2018)
      [Abstract] This article proposes a method for assigning levels to the collocations of the Diccionario de Colocaciones del Español according to the levels proposed in the CEFR. As a levelling criterion, ...
    • Assessment of Pre-Trained Models Across Languages and Grammars 

      Muñoz-Ortiz, Alberto; Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2023-11)
      [Abstract]: We present an approach for assessing how multilingual large language models (LLMs) learn syntax in terms of multi-formalism syntactic structures. We aim to recover constituent and dependency structures by ...
    • BERTbek: A Pretrained Language Model for Uzbek 

      Kuriyozov, Elmurod; Vilares, David; Gómez-Rodríguez, Carlos (European Language Resources Association (ELRA), 2024-05)
      [Abstract]: Recent advances in neural networks based language representation made it possible for pretrained language models to outperform previous models in many downstream natural language processing (NLP) tasks. These ...
    • Bertinho: Galician BERT Representations 

      Vilares, David; García, Marcos; Gómez-Rodríguez, Carlos (Sociedad Española para el Procesamiento del Lenguaje Natural, 2021-03)
      [Abstract]: This paper presents a monolingual BERT model for Galician. We follow the recent trend that shows that it is feasible to build robust monolingual BERT models even for relatively low-resource languages, while ...
    • Better, Faster, Stronger Sequence Tagging Constituent Parsers 

      Vilares, David; Abdou, Mostafa; Søgaard, Anders (Association for Computational Linguistics, 2019-06)
      [Abstract]: Sequence tagging models for constituent parsing are faster, but less accurate than other types of parsers. In this work, we address the following weaknesses of such constituent parsers: (a) high error rates ...