Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing
| UDC.coleccion | Investigación | es_ES |
| UDC.conferenceTitle | International Conference on Recent Advances in Natural Language Processing (RANLP 2021) | es_ES |
| UDC.departamento | Letras | es_ES |
| UDC.endPage | 988 | es_ES |
| UDC.grupoInv | Lingua e Sociedade da Información (LYS) | es_ES |
| UDC.journalTitle | Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021) | es_ES |
| UDC.startPage | 978 | es_ES |
| dc.contributor.author | Muñoz-Ortiz, Alberto | |
| dc.contributor.author | Strzyz, Michalina | |
| dc.contributor.author | Vilares, David | |
| dc.date.accessioned | 2024-05-28T08:37:33Z | |
| dc.date.available | 2024-05-28T08:37:33Z | |
| dc.date.issued | 2021-09 | |
| dc.description | It was held online, 1-3 September 2021. | es_ES |
| dc.description.abstract | [Abstract]: Different linearizations have been proposed to cast dependency parsing as sequence labeling and solve the task as: (i) a head selection problem, (ii) finding a representation of the token arcs as bracket strings, or (iii) associating partial transition sequences of a transition-based parser to words. Yet, there is little understanding about how these linearizations behave in low-resource setups. Here, we first study their data efficiency, simulating data-restricted setups from a diverse set of rich-resource treebanks. Second, we test whether such differences manifest in truly low-resource setups. The results show that head selection encodings are more data-efficient and perform better in an ideal (gold) framework, but that such advantage greatly vanishes in favour of bracketing formats when the running setup resembles a real-world low-resource configuration. | es_ES |
| dc.description.sponsorship | This work is supported by a 2020 Leonardo Grant for Researchers and Cultural Creators from the FBBVA. The work also receives funding from the European Research Council (FASTPARSE, grant agreement No 714150), from ERDF/MICINN-AEI (ANSWER-ASAP, TIN2017-85160-C2-1-R, SCANNER, PID2020-113230RB-C21), from Xunta de Galicia (ED431C 2020/11), and from Centro de Investigación de Galicia ‘CITIC’, funded by Xunta de Galicia and the European Union (European Regional Development Fund - Galicia 2014-2020 Program) by grant ED431G 2019/01. | es_ES |
| dc.description.sponsorship | Xunta de Galicia; ED431C 2020/11 | es_ES |
| dc.description.sponsorship | Xunta de Galicia; ED431G 2019/01 | es_ES |
| dc.identifier.citation | Alberto Muñoz-Ortiz, Michalina Strzyz, and David Vilares. 2021. Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 978–988, Held Online. INCOMA Ltd. | es_ES |
| dc.identifier.uri | http://hdl.handle.net/2183/36664 | |
| dc.language.iso | eng | es_ES |
| dc.publisher | INCOMA Ltd. | es_ES |
| dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/714150 | es_ES |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113230RB-C21/ES/MODELOS MULTITAREA DE ETIQUETADO SECUENCIAL PARA EL RECONOCIMIENTO DE ENTIDADES ENRIQUECIDO CON INFORMACIÓN LINGÜÍSTICA: SINTAXIS E INTEGRACIÓN MULTITAREA (SCANNER-UDC) | es_ES |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-85160-C2-1-R/ES/AVANCES EN NUEVOS SISTEMAS DE EXTRACCION DE RESPUESTAS CON ANALISIS SEMANTICO Y APRENDIZAJE PROFUNDO | es_ES |
| dc.relation.uri | https://aclanthology.org/2021.ranlp-1.111/ | es_ES |
| dc.rights | Atribución 3.0 España | es_ES |
| dc.rights.accessRights | open access | es_ES |
| dc.rights.uri | http://creativecommons.org/licenses/by/3.0/es/ | * |
| dc.subject | Dependency Parsing | es_ES |
| dc.subject | Sequence Labeling | es_ES |
| dc.subject | Low-Resource NLP | es_ES |
| dc.subject | Data Efficiency | es_ES |
| dc.title | Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing | es_ES |
| dc.type | conference output | es_ES |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | edf1cde8-d272-4a73-bdd3-9be2361b7651 | |
| relation.isAuthorOfPublication | 37dabbe9-f54f-43bb-960e-0bf3ac7e54eb | |
| relation.isAuthorOfPublication.latestForDiscovery | edf1cde8-d272-4a73-bdd3-9be2361b7651 |
Files
Original bundle
- Name: Muñoz_Ortiz_2021_Not_all_linearizations_equally_data_hungry.pdf
- Size: 361.71 KB
- Format: Adobe Portable Document Format