Statistical language models for query-by-example spoken document retrieval

UDC.coleccion: Investigación [es_ES]
UDC.departamento: Ciencias da Computación e Tecnoloxías da Información [es_ES]
UDC.endPage: 7949 [es_ES]
UDC.grupoInv: Information Retrieval Lab (IRlab) [es_ES]
UDC.institutoCentro: CITIC - Centro de Investigación de Tecnoloxías da Información e da Comunicación [es_ES]
UDC.journalTitle: Multimedia Tools and Applications [es_ES]
UDC.startPage: 7927 [es_ES]
UDC.volume: 79 [es_ES]
dc.contributor.author: López-Otero, Paula
dc.contributor.author: Parapar, Javier
dc.contributor.author: Barreiro, Álvaro
dc.date.accessioned: 2025-03-07T16:49:46Z
dc.date.available: 2025-03-07T16:49:46Z
dc.date.issued: 2020-01
dc.description: This version of the article has been accepted for publication, after peer review, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/s11042-019-08522-z. [es_ES]
dc.description.abstract: [Abstract]: Query-by-example spoken document retrieval (QbESDR) consists in, given a collection of documents, computing how likely a spoken query is present in each document. This is usually done by means of pattern matching techniques based on dynamic time warping (DTW), which leads to acceptable results but is inefficient in terms of query processing time. In this paper, the use of probabilistic retrieval models for information retrieval is applied to the QbESDR scenario. First, each document is represented by means of a language model, as commonly done in information retrieval, obtained by estimating the probability of the different n-grams extracted from automatic phone transcriptions of the documents. Then, the score of a query given a document can be computed following the query likelihood retrieval model. Besides the adaptation of this model to QbESDR, this paper presents two techniques that aim at enhancing the performance of this method. One of them consists in improving the language models of the documents by using several phone transcription hypotheses for each document. The other approach aims at re-ranking the retrieved documents by incorporating positional information to the system, which is achieved by string alignment of the query and document phone transcriptions. Experiments were performed on two large and heterogeneous datasets specifically designed for search on speech tasks, namely MediaEval 2013 Spoken Web Search (SWS 2013) and MediaEval 2014 Query-by-Example Search on Speech (QUESST 2014). The experimental results prove the validity of the proposed strategies for QbESDR. In addition, the performance when dealing with queries with word reorderings is superior to that exhibited by a DTW-based strategy, and the query processing time is smaller by several orders of magnitude. [es_ES]
dc.description.sponsorship: This work has received financial support from projects RTI2018-093336-B-C22 (Ministerio de Ciencia, Innovación y Universidades and European Regional Development Fund – ERDF), GPC ED431B 2019/03 (Xunta de Galicia and ERDF) and accreditation ED431G/01 (Xunta de Galicia and ERDF). [es_ES]
dc.description.sponsorship: Xunta de Galicia; ED431B 2019/03 [es_ES]
dc.description.sponsorship: Xunta de Galicia; ED431G/01 [es_ES]
dc.identifier.citation: Lopez-Otero, P., Parapar, J. & Barreiro, A. Statistical language models for query-by-example spoken document retrieval. Multimed Tools Appl 79, 7927–7949 (2020). https://doi.org/10.1007/s11042-019-08522-z [es_ES]
dc.identifier.doi: 10.1007/s11042-019-08522-z
dc.identifier.issn: 1380-7501
dc.identifier.issn: 1573-7721
dc.identifier.uri: http://hdl.handle.net/2183/41331
dc.language.iso: eng [es_ES]
dc.publisher: Springer [es_ES]
dc.relation.projectID: info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-093336-B-C22/ES/TECNOLOGIAS PARA LA PREDICCION TEMPRANA DE SIGNOS RELACIONADOS CON TRASTORNOS PSICOLOGICOS (SUBPROYECTO UDC) [es_ES]
dc.relation.uri: https://doi.org/10.1007/s11042-019-08522-z [es_ES]
dc.rights: This version of the article is subject to Springer Nature’s AM terms of use (https://www.springernature.com/gp/open-science/policies/accepted-manuscript-terms), but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/s11042-019-08522-z. [es_ES]
dc.rights.accessRights: open access [es_ES]
dc.subject: Query-by-example spoken document retrieval [es_ES]
dc.subject: Phone decoding [es_ES]
dc.subject: Phone n-grams [es_ES]
dc.subject: Language models [es_ES]
dc.subject: Minimum edit distance [es_ES]
dc.title: Statistical language models for query-by-example spoken document retrieval [es_ES]
dc.type: journal article [es_ES]
dspace.entity.type: Publication
relation.isAuthorOfPublication: eec0c53b-d226-4e1f-b2f8-1d9719fa3b0a
relation.isAuthorOfPublication: fef1a9cb-e346-4e53-9811-192e144f09d0
relation.isAuthorOfPublication: a3e43020-ee28-428d-8087-2f3c1e20aa2c
relation.isAuthorOfPublication.latestForDiscovery: eec0c53b-d226-4e1f-b2f8-1d9719fa3b0a
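
Illustrative note: the abstract describes scoring a query against per-document language models built from phone n-grams, following the query likelihood retrieval model. The Python sketch below shows one way such scoring could look. It is not the authors' implementation: the abstract does not say which smoothing is used, so Jelinek-Mercer interpolation with a collection model is an assumption here, and the function names, the bigram order, the value of `lam`, and the toy phone sequences are all hypothetical.

```python
import math
from collections import Counter


def ngrams(phones, n):
    """Return the phone n-grams (as tuples) found in a phone sequence."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def query_log_likelihood(query, doc, col_counts, col_total, n=2, lam=0.5):
    """Log-likelihood of the query n-grams under a smoothed document model.

    Assumption: Jelinek-Mercer smoothing, i.e. a linear interpolation of
    the document model and the collection model with weight `lam`.
    """
    doc_counts = Counter(ngrams(doc, n))
    doc_total = sum(doc_counts.values())
    score = 0.0
    for gram in ngrams(query, n):
        p_doc = doc_counts[gram] / doc_total if doc_total else 0.0
        p_col = col_counts[gram] / col_total if col_total else 0.0
        p = (1.0 - lam) * p_doc + lam * p_col
        # Floor for n-grams unseen in the whole collection (assumption).
        score += math.log(p if p > 0.0 else 1e-12)
    return score


def rank_documents(query, docs, n=2, lam=0.5):
    """Rank documents by query log-likelihood, best first."""
    col_counts = Counter()
    for phones in docs.values():
        col_counts.update(ngrams(phones, n))
    col_total = sum(col_counts.values())
    scored = [
        (name, query_log_likelihood(query, phones, col_counts, col_total, n, lam))
        for name, phones in docs.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy phone transcriptions, invented for illustration only.
docs = {
    "doc_a": "k a s a b l a n k a".split(),
    "doc_b": "m a d r i d".split(),
}
query = "b l a n k a".split()

for name, score in rank_documents(query, docs):
    print(name, round(score, 3))
```

In this toy example every query bigram occurs in `doc_a` and none occurs in `doc_b`, so `doc_a` is ranked first. Because each document is reduced to n-gram counts, scoring needs no frame-level alignment, which is consistent with the abstract's point about query processing time relative to DTW.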

Files

Original bundle

Name: Lopez_Otero_Paula_2020_Statistical_language_models_for_query-by-example_spoken_document_retrieval.pdf
Size: 1.96 MB
Format: Adobe Portable Document Format