Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing

UDC.coleccion: Research
UDC.conferenceTitle: International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
UDC.departamento: Letras
UDC.endPage: 988
UDC.grupoInv: Lingua e Sociedade da Información (LYS)
UDC.journalTitle: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
UDC.startPage: 978
dc.contributor.author: Muñoz-Ortiz, Alberto
dc.contributor.author: Strzyz, Michalina
dc.contributor.author: Vilares, David
dc.date.accessioned: 2024-05-28T08:37:33Z
dc.date.available: 2024-05-28T08:37:33Z
dc.date.issued: 2021-09
dc.description: It was held online, 1-3 September 2021.
dc.description.abstract: [Abstract]: Different linearizations have been proposed to cast dependency parsing as sequence labeling and solve the task as: (i) a head selection problem, (ii) finding a representation of the token arcs as bracket strings, or (iii) associating partial transition sequences of a transition-based parser to words. Yet, there is little understanding about how these linearizations behave in low-resource setups. Here, we first study their data efficiency, simulating data-restricted setups from a diverse set of rich-resource treebanks. Second, we test whether such differences manifest in truly low-resource setups. The results show that head selection encodings are more data-efficient and perform better in an ideal (gold) framework, but that such advantage greatly vanishes in favour of bracketing formats when the running setup resembles a real-world low-resource configuration.
dc.description.sponsorship: This work is supported by a 2020 Leonardo Grant for Researchers and Cultural Creators from the FBBVA. The work also receives funding from the European Research Council (FASTPARSE, grant agreement No 714150), from ERDF/MICINN-AEI (ANSWER-ASAP, TIN2017-85160-C2-1-R, SCANNER, PID2020-113230RB-C21), from Xunta de Galicia (ED431C 2020/11), and from Centro de Investigación de Galicia ‘CITIC’, funded by Xunta de Galicia and the European Union (European Regional Development Fund - Galicia 2014-2020 Program) by grant ED431G 2019/01.
dc.description.sponsorship: Xunta de Galicia; ED431C 2020/11
dc.description.sponsorship: Xunta de Galicia; ED431G 2019/01
dc.identifier.citation: Alberto Muñoz-Ortiz, Michalina Strzyz, and David Vilares. 2021. Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 978–988, Held Online. INCOMA Ltd.
dc.identifier.uri: http://hdl.handle.net/2183/36664
dc.language.iso: eng
dc.publisher: INCOMA Ltd.
dc.relation.projectID: info:eu-repo/grantAgreement/EC/H2020/714150
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113230RB-C21/ES/MODELOS MULTITAREA DE ETIQUETADO SECUENCIAL PARA EL RECONOCIMIENTO DE ENTIDADES ENRIQUECIDO CON INFORMACIÓN LINGÜÍSTICA: SINTAXIS E INTEGRACIÓN MULTITAREA (SCANNER-UDC)
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-85160-C2-1-R/ES/AVANCES EN NUEVOS SISTEMAS DE EXTRACCION DE RESPUESTAS CON ANALISIS SEMANTICO Y APRENDIZAJE PROFUNDO
dc.relation.uri: https://aclanthology.org/2021.ranlp-1.111/
dc.rights: Attribution 3.0 Spain
dc.rights.accessRights: open access
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/es/
dc.subject: Dependency Parsing
dc.subject: Sequence Labeling
dc.subject: Low-Resource NLP
dc.subject: Data Efficiency
dc.title: Not All Linearizations Are Equally Data-Hungry in Sequence Labeling Parsing
dc.type: conference output
dspace.entity.type: Publication

Files

Original bundle

Name: Muñoz_Ortiz_2021_Not_all_linearizations_equally_data_hungry.pdf
Size: 361.71 KB
Format: Adobe Portable Document Format