Constituent Parsing as Sequence Labeling

Gómez-Rodríguez, Carlos; Vilares, David

dc.contributor.author	Gómez-Rodríguez, Carlos
dc.contributor.author	Vilares, David
dc.date.accessioned	2024-01-23T13:52:57Z
dc.date.available	2024-01-23T13:52:57Z
dc.date.issued	2018
dc.identifier.citation	Carlos Gómez-Rodríguez and David Vilares. 2018. Constituent Parsing as Sequence Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1314–1324, Brussels, Belgium. Association for Computational Linguistics.	es_ES
dc.identifier.isbn	978-1-948087-84-1
dc.identifier.uri	http://hdl.handle.net/2183/35085
dc.description	EMNLP 2018, Square Meeting Center, Brussels. From October 31st through November 4th.	es_ES
dc.description.abstract	[Abstract]: We introduce a method to reduce constituent parsing to sequence labeling. For each word w_t, it generates a label that encodes: (1) the number of ancestors in the tree that the words w_t and w_{t+1} have in common, and (2) the nonterminal symbol at the lowest common ancestor. We first prove that the proposed encoding function is injective for any tree without unary branches. In practice, the approach is made extensible to all constituency trees by collapsing unary branches. We then use the PTB and CTB treebanks as testbeds and propose a set of fast baselines. We achieve 90% F-score on the PTB test set, outperforming the Vinyals et al. (2015) sequence-to-sequence parser. In addition, sacrificing some accuracy, our approach achieves the fastest constituent parsing speeds reported to date on PTB by a wide margin.	es_ES
dc.description.sponsorship	This work has received funding from the European Research Council (ERC), under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150), from the TELEPARESUDC project (FFI2014-51978-C2-2-R) and the ANSWER-ASAP project (TIN2017-85160-C2-1-R) from MINECO, and from Xunta de Galicia (ED431B 2017/01). We gratefully acknowledge NVIDIA Corporation for the donation of a GTX Titan X GPU.	es_ES
dc.description.sponsorship	Xunta de Galicia; ED431B 2017/01	es_ES
dc.language.iso	eng	es_ES
dc.publisher	Association for Computational Linguistics (ACL)	es_ES
dc.relation.uri	https://doi.org/10.18653/v1/D18-1162	es_ES
dc.rights	Attribution 3.0 Spain	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by/3.0/es/	*
dc.subject	Constituent parsing	es_ES
dc.subject	Penn treebank	es_ES
dc.subject	Sequence labeling	es_ES
dc.subject	Nonterminal symbols	es_ES
dc.title	Constituent Parsing as Sequence Labeling	es_ES
dc.type	conference output	es_ES
dc.rights.accessRights	open access	es_ES
UDC.startPage	1314	es_ES
UDC.endPage	1324	es_ES
UDC.conferenceTitle	2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018)	es_ES
UDC.coleccion	Investigación	es_ES
UDC.departamento	Letras	es_ES
UDC.grupoInv	Lingua e Sociedade da Información (LYS)	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/EC/H2020/714150	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/TIN2017-85160-C2-1-R/ES/AVANCES EN NUEVOS SISTEMAS DE EXTRACCION DE RESPUESTAS CON ANALISIS SEMANTICO Y APRENDIZAJE PROFUNDO	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/FFI2014-51978-C2-2-R/ES/TECNOLOGIAS DE LA LENGUA PARA ANALISIS DE OPINIONES EN REDES SOCIALES: DEL TEXTO AL MICROTEXTO	es_ES
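
The encoding described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical tree representation in which an internal node is a `(label, children)` tuple and a word is a plain string, it assumes the tree has no unary branches, and it leaves out the last word, which has no successor w_{t+1}. The function names `leaf_paths` and `encode` are illustrative only.

```python
def leaf_paths(node, address=()):
    """Yield (word, ancestors) for each leaf, left to right.

    Each ancestor is an (address, label) pair, listed from the root down.
    The address (child indices from the root) tells apart distinct nodes
    that happen to carry the same nonterminal label.
    """
    if isinstance(node, str):          # a word: no ancestors added here
        yield node, []
        return
    label, children = node
    for i, child in enumerate(children):
        for word, path in leaf_paths(child, address + (i,)):
            yield word, [(address, label)] + path


def encode(tree):
    """Return one (word, n_common, nonterminal) triple per word w_t, t < n.

    n_common is the number of ancestors shared by w_t and w_{t+1};
    nonterminal is the label of their lowest common ancestor.
    """
    leaves = list(leaf_paths(tree))
    labels = []
    for (word, path), (_, next_path) in zip(leaves, leaves[1:]):
        common = 0
        for a, b in zip(path, next_path):
            if a != b:
                break
            common += 1
        # The root is always shared, so common >= 1 for a multi-word tree.
        labels.append((word, common, path[common - 1][1]))
    return labels


# Hypothetical example tree: (S (NP the dog) (VP chased (NP a cat)))
tree = ("S", [("NP", ["the", "dog"]),
              ("VP", ["chased", ("NP", ["a", "cat"])])])
labels = encode(tree)
```

For the example tree, "the" and "dog" share two ancestors (S and NP) with NP as the lowest, while "dog" and "chased" share only the root S. The abstract's injectivity claim means that, for trees without unary branches, such a label sequence determines the tree.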

Files in this item

Name: license_rdf
Size: 1.337Kb
Format: application/rdf+xml

Name: GomezRodriguez_Carlos_2018_con ...
Size: 213.9Kb
Format: PDF

This item appears in the following collection(s)

Investigación (FFIL) [877]