Browse Lingua e Sociedade da Información (Language in the Information Society) (LYS) by title

Another Dead End for Morphological Tags? Perturbed Inputs and Parsing

Muñoz-Ortiz, Alberto; Vilares, David (Association for Computational Linguistics, 2023-07)

[Abstract]: The usefulness of part-of-speech tags for parsing has been heavily questioned due to the success of word-contextualized parsers. Yet, most studies are limited to coarse-grained tags and high-quality written ...

Any papyrus about "a hand over a stool and a bread loaf, followed by a boat"? Dealing with hieroglyphic texts in IR

Iglesias-Franjo, Estíbaliz; Vilares, Jesús (ACM International Conference Proceeding Series, 2016-06)

[Abstract] Digital Heritage deals with the use of computing and information technologies for the preservation and study of the human cultural legacy. Within this context, we present here a Text Retrieval system developed ...

Una aproximación supervisada para la minería de opiniones sobre tuits en español en base a conocimiento lingüístico

Vilares, David; Alonso, Miguel A.; Gómez-Rodríguez, Carlos (Sociedad Española para el Procesamiento del Lenguaje Natural, 2013)

[Abstract]: This article describes a system for classifying the polarity of tweets written in Spanish. It adopts a hybrid approach that combines linguistic knowledge obtained through NLP with ...

Artificially Evolved Chunks for Morphosyntactic Analysis

Anderson, Mark; Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2019-08)

[Abstract]: We introduce a language-agnostic evolutionary technique for automatically extracting chunks from dependency treebanks. We evaluate these chunks on a number of morphosyntactic tasks, namely POS tagging, ...

Assessment of Pre-Trained Models Across Languages and Grammars

Muñoz-Ortiz, Alberto; Vilares, David; Gómez-Rodríguez, Carlos (Association for Computational Linguistics, 2023-11)

[Abstract]: We present an approach for assessing how multilingual large language models (LLMs) learn syntax in terms of multi-formalism syntactic structures. We aim to recover constituent and dependency structures by ...

Better, Faster, Stronger Sequence Tagging Constituent Parsers

Vilares, David; Abdou, Mostafa; Søgaard, Anders (Association for Computational Linguistics, 2019-06)

[Abstract]: Sequence tagging models for constituent parsing are faster, but less accurate than other types of parsers. In this work, we address the following weaknesses of such constituent parsers: (a) high error rates ...

Bracketing Encodings for 2-Planar Dependency Parsing

Strzyz, Michalina; Vilares, David; Gómez-Rodríguez, Carlos (International Committee on Computational Linguistics, 2020-12)

[Abstract]: We present a bracketing-based encoding that can be used to represent any 2-planar dependency tree over a sentence of length n as a sequence of n labels, hence providing almost total coverage of crossing arcs ...

Building a New Sentiment Analysis Dataset for Uzbek Language and Creating Baseline Models

Kuriyozov, Elmurod; Matlatipov, Sanatbek (2019-08-02)

[Abstract] Making natural language processing technologies available for low-resource languages is an important goal to improve the access to technology in their communities of speakers. In this paper, we provide the first ...

Clasificación de polaridad en textos con opiniones en español mediante análisis sintáctico de dependencias

Vilares, David; Alonso, Miguel A.; Gómez-Rodríguez, Carlos (Sociedad Española para el Procesamiento del Lenguaje Natural, 2013)

[Abstract]: This article describes an opinion mining system that classifies the polarity of texts in Spanish. It proposes an NLP-based approach that involves performing segmentation, tokenization and ...

Cognitive Constraints Built into Formal Grammars: Implications for Language Evolution

Gómez-Rodríguez, Carlos; Christiansen, Morten H.; Ferrer-i-Cancho, Ramon (Ravignani, A., Barbieri, C., Martins, M., Flaherty, M., Jadoul, Y., Lattenkamp, E., Little, H., Mudd, K., Verhoef, T., 2020-04-17)

[Abstract] We study the validity of the cognitive independence assumption using an ensemble of artificial syntactic structures from various classes of dependency grammars. Our findings show that memory limitations have ...

Constituent Parsing as Sequence Labeling

Gómez-Rodríguez, Carlos; Vilares, David (Association for Computational Linguistics (ACL), 2018)

[Abstract]: We introduce a method to reduce constituent parsing to sequence labeling. For each word w_t, it generates a label that encodes: (1) the number of ancestors in the tree that the words w_t and w_{t+1} have in common, ...

Construcción de una lista de colocaciones para medir la competencia colocacional

Orol-González, Ana (Centro Virtual Cervantes, 2015)

[Abstract] The aim of this work is to create a list of Spanish collocations for assessment purposes. To create this list, we followed a set of previously established criteria which are based on lists of frequent ...

Creación de un treebank de dependencias universales mediante recursos existentes para lenguas próximas: el caso del gallego

García, Marcos; Gómez-Rodríguez, Carlos; Alonso, Miguel A. (Sociedad Española para el Procesamiento del Lenguaje Natural, 2016-09)

[Abstract] In this work we present a new strategy for creating treebanks for languages with few resources for syntactic parsing. The method consists of adapting and combining different treebanks annotated ...