Use this link to cite:
http://hdl.handle.net/2183/36664 Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing
Loading...
Identifiers
Publication date
Authors
Advisors
Other responsabilities
Journal Title
Bibliographic citation
Alberto Muñoz-Ortiz, Michalina Strzyz, and David Vilares. 2021. Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 978–988, Held Online. INCOMA Ltd..
Type of academic work
Academic degree
Abstract
[Absctract]: Different linearizations have been proposed to cast dependency parsing as sequence labeling and solve the task as: (i) a head selection problem, (ii) finding a representation of the token arcs as bracket strings, or (iii) associating partial transition sequences of a transition-based parser to words. Yet, there is little understanding about how these linearizations behave in low-resource setups. Here, we first study their data efficiency, simulating data-restricted setups from a diverse set of rich-resource treebanks. Second, we test whether such differences manifest in truly low-resource setups. The results show that head selection encodings are more data-efficient and perform better in an ideal (gold) framework, but that such advantage greatly vanishes in favour of bracketing formats when the running setup resembles a real-world low-resource configuration.
Description
It was held online, 1-3 September 2021.
Editor version
Rights
Atribución 3.0 España








