
dc.contributor.author | Alonso-Alonso, Iago
dc.contributor.author | Vilares, David
dc.contributor.author | Gómez-Rodríguez, Carlos
dc.date.accessioned | 2024-05-23T08:10:00Z
dc.date.available | 2024-05-23T08:10:00Z
dc.date.issued | 2022-10
dc.identifier.citation | Iago Alonso-Alonso, David Vilares, and Carlos Gómez-Rodríguez. 2022. The Fragility of Multi-Treebank Parsing Evaluation. In Proceedings of the 29th International Conference on Computational Linguistics, pages 5345–5359, Gyeongju, Republic of Korea. International Committee on Computational Linguistics. | es_ES
dc.identifier.uri | http://hdl.handle.net/2183/36590
dc.description | Held in Gyeongju, Republic of Korea. October 12-17, 2022 | es_ES
dc.description.abstract | [Abstract]: Treebank selection for parsing evaluation and the spurious effects that might arise from a biased choice have not been explored in detail. This paper studies how evaluating on a single subset of treebanks can lead to weak conclusions. First, we take a few contrasting parsers, and run them on subsets of treebanks proposed in previous work, whose use was justified (or not) on criteria such as typology or data scarcity. Second, we run a large-scale version of this experiment, create vast amounts of random subsets of treebanks, and compare on them many parsers whose scores are available. The results show substantial variability across subsets and that although establishing guidelines for good treebank selection is hard, some inadequate strategies can be easily avoided. | es_ES
dc.description.sponsorship | This work was supported by a 2020 Leonardo Grant for Researchers and Cultural Creators from the FBBVA, as well as by the European Research Council (ERC), under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150). The work is also supported by ERDF/MICINN-AEI (SCANNER-UDC, PID2020-113230RB-C21), by Xunta de Galicia (ED431C 2020/11), and by Centro de Investigación de Galicia “CITIC” which is funded by Xunta de Galicia, Spain and the European Union (ERDF - Galicia 2014–2020 Program), by grant ED431G 2019/01. | es_ES
dc.description.sponsorship | Xunta de Galicia; ED431C 2020/11 | es_ES
dc.description.sponsorship | Xunta de Galicia; ED431G 2019/01 | es_ES
dc.language.iso | eng | es_ES
dc.publisher | International Committee on Computational Linguistics | es_ES
dc.relation | info:eu-repo/grantAgreement/EC/H2020/714150 | es_ES
dc.relation | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113230RB-C21/ES/MODELOS MULTITAREA DE ETIQUETADO SECUENCIAL PARA EL RECONOCIMIENTO DE ENTIDADES ENRIQUECIDO CON INFORMACIÓN LINGÜÍSTICA: SINTAXIS E INTEGRACIÓN MULTITAREA (SCANNER-UDC) | es_ES
dc.relation.uri | https://aclanthology.org/2022.coling-1.475.pdf | es_ES
dc.rights | Attribution 3.0 Spain | es_ES
dc.rights.uri | http://creativecommons.org/licenses/by/3.0/es/ | *
dc.subject | Multi-Treebank Parsing Evaluation | es_ES
dc.subject | Treebank Selection Bias | es_ES
dc.subject | Evaluation Methodology | es_ES
dc.subject | Parsing Performance Variability | es_ES
dc.title | The Fragility of Multi-Treebank Parsing Evaluation | es_ES
dc.type | info:eu-repo/semantics/conferenceObject | es_ES
dc.rights.access | info:eu-repo/semantics/openAccess | es_ES
UDC.journalTitle | Proceedings of the 29th International Conference on Computational Linguistics | es_ES
UDC.startPage | 5345 | es_ES
UDC.endPage | 5359 | es_ES
UDC.conferenceTitle | 29th International Conference on Computational Linguistics (COLING’2022) | es_ES

