Studying the Effect and Treatment of Misspelled Queries in Cross-Language Information Retrieval

UDC.coleccion: Investigación
UDC.departamento: Letras
UDC.endPage: 657
UDC.grupoInv: Lingua e Sociedade da Información (LYS)
UDC.issue: 4
UDC.journalTitle: Information Processing & Management
UDC.startPage: 646
UDC.volume: 52
dc.contributor.author: Vilares, Jesús
dc.contributor.author: Alonso, Miguel A.
dc.contributor.author: Doval, Yerai
dc.contributor.author: Vilares Ferro, Manuel
dc.date.accessioned: 2017-07-17T14:53:36Z
dc.date.available: 2017-07-17T14:53:36Z
dc.date.issued: 2016-07
dc.description.abstract: [Abstract] The performance of Information Retrieval systems is limited by the linguistic variation present in natural language texts. Word-level Natural Language Processing techniques have been shown to be useful in reducing this variation. In this article, we summarize our work on the extension of these techniques for dealing with phrase-level variation in European languages, taking Spanish as a case in point. We propose the use of syntactic dependencies as complex index terms in an attempt to solve the problems deriving from both syntactic and morpho-syntactic variation and, in this way, to obtain more precise index terms. Such dependencies are obtained through a shallow parser based on cascades of finite-state transducers in order to reduce as far as possible the overhead due to this parsing process. The use of different sources of syntactic information, queries or documents, has been also studied, as has the restriction of the dependencies applied to those obtained from noun phrases. Our approaches have been tested using the CLEF corpus, obtaining consistent improvements with regard to classical word-level non-linguistic techniques. Results show, on the one hand, that syntactic information extracted from documents is more useful than that from queries. On the other hand, it has been demonstrated that by restricting dependencies to those corresponding to noun phrases, important reductions of storage and management costs can be achieved, albeit at the expense of a slight reduction in performance.
dc.description.sponsorship: Ministerio de Economía y Competitividad; FFI2014-51978-C2-1-R
dc.description.sponsorship: Rede Galega de Procesamento da Linguaxe e Recuperación de Información; CN2014/034
dc.description.sponsorship: Ministerio de Economía y Competitividad; BES-2015-073768
dc.description.sponsorship: Ministerio de Economía y Competitividad; FFI2014-51978-C2-2-R
dc.identifier.citation: Jesús Vilares, Miguel A. Alonso, Yerai Doval and Manuel Vilares, Studying the Effect and Treatment of Misspelled Queries in Cross-Language Information Retrieval, Information Processing & Management, 52(4):646-657, 2016
dc.identifier.issn: 0306-4573
dc.identifier.uri: http://hdl.handle.net/2183/19290
dc.language.iso: eng
dc.relation.uri: http://www.sciencedirect.com/science/article/pii/S0306457315001478?via%3Dihub
dc.rights.accessRights: open access
dc.subject: Misspelled queries
dc.subject: Cross-Language information retrieval
dc.subject: Machine translation
dc.subject: Spelling correction
dc.subject: Character n-grams
dc.title: Studying the Effect and Treatment of Misspelled Queries in Cross-Language Information Retrieval
dc.type: journal article
dspace.entity.type: Publication
relation.isAuthorOfPublication: 3313b723-2288-4d9d-b0e7-32732c9c78d5
relation.isAuthorOfPublication: 1318edb8-3967-465c-a267-146624c05837
relation.isAuthorOfPublication: 3d821e9c-de0b-47cc-a4e0-7c531569602e
relation.isAuthorOfPublication.latestForDiscovery: 3313b723-2288-4d9d-b0e7-32732c9c78d5

Files

Original bundle

Name: Vilares_Jesus_2016_Studying_the_Effect_and_Treatment_of_Misspelled_Queries_in_Cross-Language_Information_Retrieval.pdf
Size: 855.84 KB
Format: Adobe Portable Document Format
Description: