    • 4 and 7-bit Labeling for Projective and Non-Projective Dependency Trees 

      Gómez-Rodríguez, Carlos; Roca Rodríguez, Diego; Vilares, David (Association for Computational Linguistics, 2023-12)
      [Abstract]: We introduce an encoding for parsing as sequence labeling that can represent any projective dependency tree as a sequence of 4-bit labels, one per word. The bits in each word’s label represent (1) whether it ...
    • A non-projective greedy dependency parser with bidirectional LSTMs 

      Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2017-08)
      [Abstract]: The LyS-FASTPARSE team present BIST-COVINGTON, a neural implementation of the Covington (2001) algorithm for non-projective dependency parsing. The bidirectional LSTM approach by Kiperwasser and Goldberg (2016) ...
    • A Transition-Based Algorithm for Unrestricted AMR Parsing 

      Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2018-06)
      [Abstract]: Non-projective parsing can be useful to handle cycles and reentrancy in AMR graphs. We explore this idea and introduce a greedy left-to-right non-projective transition-based parser. At each parsing configuration, ...
    • A Unifying Theory of Transition-based and Sequence Labeling Parsing 

      Gómez-Rodríguez, Carlos; Strzyz, Michalina; Vilares, David (International Committee on Computational Linguistics, 2020-12)
      [Abstract]: We define a mapping from transition-based parsing algorithms that read sentences from left to right to sequence labeling encodings of syntactic trees. This not only establishes a theoretical relation between ...
    • Another Dead End for Morphological Tags? Perturbed Inputs and Parsing 

      Muñoz-Ortiz, Alberto; Vilares, David (Association for Computational Linguistics, 2023-07)
      [Abstract]: The usefulness of part-of-speech tags for parsing has been heavily questioned due to the success of word-contextualized parsers. Yet, most studies are limited to coarse-grained tags and high quality written ...
    • Artificially Evolved Chunks for Morphosyntactic Analysis 

      Anderson, Mark; Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2019-08)
      [Abstract]: We introduce a language-agnostic evolutionary technique for automatically extracting chunks from dependency treebanks. We evaluate these chunks on a number of morphosyntactic tasks, namely POS tagging, ...
    • Assessment of Pre-Trained Models Across Languages and Grammars 

      Muñoz-Ortiz, Alberto; Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2023-11)
      [Abstract]: We present an approach for assessing how multilingual large language models (LLMs) learn syntax in terms of multi-formalism syntactic structures. We aim to recover constituent and dependency structures by ...
    • Bertinho: Galician BERT Representations 

      Vilares, David; García, Marcos; Gómez-Rodríguez, Carlos (Sociedad Española para el Procesamiento del Lenguaje Natural, 2021-03)
      [Abstract]: This paper presents a monolingual BERT model for Galician. We follow the recent trend that shows that it is feasible to build robust monolingual BERT models even for relatively low-resource languages, while ...
    • Better, Faster, Stronger Sequence Tagging Constituent Parsers 

      Vilares, David; Abdou, Mostafa; Søgaard, Anders (Association for Computational Linguistics, 2019-06)
      [Abstract]: Sequence tagging models for constituent parsing are faster, but less accurate than other types of parsers. In this work, we address the following weaknesses of such constituent parsers: (a) high error rates ...
    • Bracketing Encodings for 2-Planar Dependency Parsing 

      Strzyz, Michalina; Vilares, David; Gómez-Rodríguez, Carlos (International Committee on Computational Linguistics, 2020-12)
      [Abstract]: We present a bracketing-based encoding that can be used to represent any 2-planar dependency tree over a sentence of length n as a sequence of n labels, hence providing almost total coverage of crossing arcs ...
    • Constituent Parsing as Sequence Labeling 

      Gómez-Rodríguez, Carlos; Vilares, David (Association for Computational Linguistics (ACL), 2018)
      [Abstract]: We introduce a method to reduce constituent parsing to sequence labeling. For each word w_t, it generates a label that encodes: (1) the number of ancestors in the tree that the words w_t and w_{t+1} have in common, ...
    • Cross-lingual Inflection as a Data Augmentation Method for Parsing 

      Muñoz-Ortiz, Alberto; Gómez-Rodríguez, Carlos; Vilares, David (Association for Computational Linguistics, 2022-05)
      [Abstract]: We propose a morphology-based method for low-resource (LR) dependency parsing. We train a morphological inflector for target LR languages, and apply it to related rich-resource (RR) treebanks to create ...
    • Discontinuous Constituent Parsing as Sequence Labeling 

      Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2020-11)
      [Abstract]: This paper reduces discontinuous parsing to sequence labeling. It first shows that existing reductions for constituent parsing as labeling do not support discontinuities. Second, it fills this gap and proposes ...
    • From Partial to Strictly Incremental Constituent Parsing 

      Ezquerro, Ana; Gómez-Rodríguez, Carlos; Vilares, David (Association for Computational Linguistics, 2024-03)
      [Abstract]: We study incremental constituent parsers to assess their capacity to output trees based on prefix representations alone. Guided by strictly left-to-right generative language models and tree-decoding modules, ...
    • Grounding the Semantics of Part-of-Day Nouns Worldwide using Twitter 

      Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2018-06)
      [Abstract]: The usage of part-of-day nouns, such as ‘night’, and their time-specific greetings (‘good night’), varies across languages and cultures. We show the possibilities that Twitter offers for studying the semantics ...
    • Harry Potter and the Action Prediction Challenge from Natural Language 

      Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2019-06)
      [Abstract]: We explore the challenge of action prediction from textual descriptions of scenes, a testbed to approximate whether text inference can be used to predict upcoming actions. As a case of study, we consider the ...
    • HEAD-QA: A Healthcare Dataset for Complex Reasoning 

      Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2019-07)
      [Abstract]: We present HEAD-QA, a multi-choice question answering testbed to encourage research on complex reasoning. The questions come from exams to access a specialized position in the Spanish healthcare system, and ...
    • How important is syntactic parsing accuracy? An empirical evaluation on rule-based sentiment analysis 

      Gómez-Rodríguez, Carlos; Alonso-Alonso, Iago; Vilares, David (Springer, 2019)
      [Abstract]: Syntactic parsing, the process of obtaining the internal structure of sentences in natural languages, is a crucial task for artificial intelligence applications that need to extract meaning from natural language ...
    • Increasing NLP Parsing Efficiency with Chunking 

      Anderson, Mark Dáibhidh; Vilares, David (MDPI AG, 2018-09-19)
      [Abstract]: We introduce a “Chunk-and-Pass” parsing technique influenced by a psycholinguistic model, where linguistic information is processed not word-by-word but rather in larger chunks of words. We present preliminary ...
    • LyS A Coruña at GUA-SPA@IberLEF2023. Multi-Task Learning with Large Language Model Encoders for Guarani-Spanish Code Switching Analysis 

      Muñoz Ortiz, Alberto; Vilares, David (2023)
      [Abstract]: This paper introduces the LyS A Coruña proposal for the Guarani-Spanish Code Switching Analysis task at IberLEF2023. The shared task proposes to analyze Guarani-Spanish code-switched texts, focusing on language ...