Parsing as Pretraining

Vilares, David; Strzyz, Michalina; Søgaard, Anders; Gómez-Rodríguez, Carlos

Use this link to cite:

http://hdl.handle.net/2183/24893

Files

Vilares_David_2020_Parsing_as_Pretraining.pdf (817.06 KB)

Identifiers

URI: http://hdl.handle.net/2183/24893

DOI: 10.1609/aaai.v34i05.6446

Publication date

2020

Authors

Vilares, David

Strzyz, Michalina

Søgaard, Anders

Gómez-Rodríguez, Carlos

Bibliographic citation

Vilares, D., Strzyz, M., Søgaard, A., & Gómez-Rodríguez, C. (2020). Parsing as Pretraining. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 9114-9121. https://doi.org/10.1609/aaai.v34i05.6446

Abstract

Recent analyses suggest that encoders pretrained for language modeling capture certain morpho-syntactic structure. However, probing frameworks for word vectors still do not report results on standard setups such as constituent and dependency parsing. This paper addresses this gap and performs full parsing (on English) relying only on pretraining architectures, with no decoding. We first cast constituent and dependency parsing as sequence tagging. We then use a single feed-forward layer to directly map word vectors to labels that encode a linearized tree. This setup is used to: (i) see how far we can get on syntax modeling with just pretrained encoders, and (ii) shed some light on the syntax-sensitivity of different word vectors (by freezing the weights of the pretraining network during training). For evaluation, we use bracketing F1-score and LAS, and analyze in depth the differences across representations for span lengths and dependency displacements. The overall results surpass existing sequence tagging parsers on the PTB (93.5%) and end-to-end EN-EWT UD (78.8%).
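
To illustrate what casting dependency parsing as sequence tagging can look like, here is a minimal sketch in Python. It assumes a plain relative-offset encoding, where each word's label is the signed distance to its head plus the relation name. This is only one possible linearization and is not necessarily the one used in the paper; the function names `encode_tree` and `decode_labels` and the example sentence are hypothetical.

```python
from typing import List, Tuple

# One label per word: (signed offset from the word to its head, relation name).
Label = Tuple[int, str]


def encode_tree(heads: List[int], rels: List[str]) -> List[Label]:
    """Linearize a dependency tree into one label per word.

    `heads` uses 1-based word positions, with 0 standing for the artificial root.
    Assumption: a simple relative-offset encoding, chosen for illustration.
    """
    if len(heads) != len(rels):
        raise ValueError("heads and rels must have the same length")
    return [(head - pos, rel)
            for pos, (head, rel) in enumerate(zip(heads, rels), start=1)]


def decode_labels(labels: List[Label]) -> Tuple[List[int], List[str]]:
    """Recover head indices and relations from the per-word labels."""
    heads = [pos + offset for pos, (offset, _) in enumerate(labels, start=1)]
    rels = [rel for _, rel in labels]
    return heads, rels


# Hypothetical example sentence: "She reads books"
example_heads = [2, 0, 2]            # "reads" is the root; the others attach to it
example_rels = ["nsubj", "root", "obj"]
example_labels = encode_tree(example_heads, example_rels)
```

Once trees are expressed this way, a tagger only has to predict one label per word, and `decode_labels` turns the predicted sequence back into head indices without any separate parsing algorithm. A real system would still need to handle ill-formed predictions (for example, offsets that point outside the sentence), which this sketch does not attempt.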