Towards the Automatic Construction of a Multilingual Dictionary of Collocations using Distributional Semantics

UDC.coleccionInvestigaciónes_ES
UDC.conferenceTitleElectronic lexicography in the 21st century. Proceedings of the eLex 2019 conference. 1-3 October 2019, Sintra, Portugal.es_ES
UDC.departamentoLetrases_ES
UDC.endPage762es_ES
UDC.grupoInvLingua e Sociedade da Información (LYS)es_ES
UDC.startPage747es_ES
dc.contributor.authorGarcía, Marcos
dc.contributor.authorGarcía Salido, Marcos
dc.contributor.authorAlonso-Ramos, Margarita
dc.date.accessioned2024-06-14T08:14:30Z
dc.date.available2024-06-14T08:14:30Z
dc.date.issued2019
dc.description.abstract[Abstract] This paper presents the method used to create a multilingual online dictionary of collocations in English, Portuguese, and Spanish. This resource is built automatically and contains three types of collocations: verb–object (e.g., “[to] issue [an] invoice”), adjective–noun (“deep shame”), and nominal compounds (“cigarette packet”). We take advantage of dependency parsing and statistical association measures to compile the collocations of each language, and then we align them with their equivalents in the other languages by means of compositional methods which use cross-lingual models of distributional semantics. Collocations are extracted from large and assorted corpora, and the cross-lingual models are mapped using unsupervised approaches. For each collocation in a given language, the system shows different equivalents in the other languages, ranked by a confidence value. Besides the multilingual perspective, the resulting dictionary can also serve as a monolingual resource to retrieve the collocates of a given base, making it useful to both native speakers and language learners. The dictionary will be published as an online tool, and all the resources generated in this research will be freely available.es_ES
dc.description.sponsorshipMinisterio de Economía, Industria y Competitividad; FFI2016-78299-Pes_ES
dc.description.sponsorshipXunta de Galicia; ED431B-2017/01es_ES
dc.description.sponsorshipJuan de la Cierva; IJCI-2016-29598es_ES
dc.description.sponsorshipXunta de Galicia; ED481D-2017-009es_ES
dc.identifier.citationGarcía, M., García Salido, M., Alonso-Ramos, M. (2019). Towards the Automatic Construction of a Multilingual Dictionary of Collocations using Distributional Semantics. In Electronic lexicography in the 21st century. Proceedings of the eLex 2019 conference. Brno: Lexical Computing, pp. 747-762es_ES
dc.identifier.urihttp://hdl.handle.net/2183/36922
dc.language.isoenges_ES
dc.publisherLexical Computinges_ES
dc.rightsAttribution ShareAlike 4.0 Internationales_ES
dc.rights.accessRightsopen accesses_ES
dc.rights.urihttp://creativecommons.org/licenses/by-sa/3.0/es/*
dc.subjectCollocationses_ES
dc.subjectDistributional semanticses_ES
dc.subjectDictionaryes_ES
dc.subjectMultilingualityes_ES
dc.titleTowards the Automatic Construction of a Multilingual Dictionary of Collocations using Distributional Semanticses_ES
dc.typeconference outputes_ES
dspace.entity.typePublication
relation.isAuthorOfPublication8da895e1-853a-406d-ad80-959c213445bf
relation.isAuthorOfPublication39fc5a16-04b5-45c5-98e6-bb4ebfaa33a3
relation.isAuthorOfPublicatione8136ff9-ca3a-4775-80e6-3e046dcabf20
relation.isAuthorOfPublication.latestForDiscovery8da895e1-853a-406d-ad80-959c213445bf

Files

Original bundle

Name:
Garcia_Salido_Marcos_Garcia_Marcos_Alonso_Ramos_Marga_2019_Automatic_Construction_Multilingual_Dictionary.pdf
Size:
328.33 KB
Format:
Adobe Portable Document Format