Relevance feedback for building pooled test collections

Otero, David; Parapar, Javier; Barreiro, Álvaro

UDC.coleccion	Investigación	es_ES
UDC.departamento	Ciencias da Computación e Tecnoloxías da Información	es_ES
UDC.endPage	18	es_ES
UDC.grupoInv	Information Retrieval Lab (IRlab)	es_ES
UDC.institutoCentro	CITIC - Centro de Investigación de Tecnoloxías da Información e da Comunicación	es_ES
UDC.journalTitle	Journal of Information Science	es_ES
UDC.startPage	1	es_ES
dc.contributor.author	Otero, David
dc.contributor.author	Parapar, Javier
dc.contributor.author	Barreiro, Álvaro
dc.date.accessioned	2025-03-05T19:08:05Z
dc.date.available	2025-03-05T19:08:05Z
dc.date.issued	2023
dc.description	This is the manuscript version of the article: Otero, D., Parapar, J., & Barreiro, Á. (2023). ‘Relevance feedback for building pooled test collections’, which has been accepted for publication in Journal of Information Science, 2023, pp. 1-18. DOI: https://doi.org/10.1177/01655515231171085.	es_ES
dc.description.abstract	[Abstract]: Offline evaluation of information retrieval systems depends on test collections. These datasets provide researchers with a corpus of documents, a set of topics and relevance judgements indicating which documents are relevant for each topic. Gathering the judgements is costly because human assessors must judge the documents. Therefore, experts usually judge only a portion of the corpus. The most common approach for selecting that subset is pooling. By intelligently choosing which documents to assess, it is possible to optimise the number of positive labels for a given budget. For this reason, much work has focused on developing techniques to better select which documents from the corpus merit human assessment. In this article, we propose using relevance feedback to prioritise the documents when building new pooled test collections. We explore several state-of-the-art statistical feedback methods for prioritising the documents the algorithm presents to the assessors. A thorough comparison on eight Text Retrieval Conference (TREC) datasets against strong baselines shows that, among other results, our proposals retrieve relevant documents with lower assessment effort than other state-of-the-art adjudicating methods, without harming reliability, fairness or reusability.	es_ES
dc.description.sponsorship	The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: this work has received support from: (1) project PLEC2021-007662 (grant no. MCIN/AEI/10.13039/501100011033, Ministerio de Ciencia e Innovación (MCIN), Agencia Estatal de Investigación, Plan de Recuperación, Transformación y Resiliencia, Unión Europea-NextGenerationEU), (2) Programa de Ayudas para la Formación de Profesorado Universitario, grant number FPU20/02659 (Ministerio de Universidades), (3) project PID2022-137061OB-C21 (Proyectos de Generación de Conocimiento, MCIN), (4) Consellería de Educación, Universidade e Formación Profesional (accreditation 2019-2022 ED431G 2019/01) and the European Regional Development Fund, which acknowledges the Centro de Investigación en Tecnologías de la Información y la Comunicación (CITIC), the Research Centre in Information and Communications Technology (ICT) of the University of A Coruña, as a Research Centre of the Galician University System and (5) project ED431-B 2022/33 (Xunta de Galicia/European Regional Development Fund (ERDF)).	es_ES
dc.description.sponsorship	Xunta de Galicia; ED431-B 2022/33	es_ES
dc.description.sponsorship	Xunta de Galicia; ED431G 2019/01	es_ES
dc.identifier.citation	Otero, D., Parapar, J., & Barreiro, Á. (2023). Relevance feedback for building pooled test collections. Journal of Information Science, [online first], pp. 1-18. https://doi.org/10.1177/01655515231171085	es_ES
dc.identifier.doi	10.1177/01655515231171085
dc.identifier.issn	0165-5515
dc.identifier.issn	1741-6485
dc.identifier.uri	http://hdl.handle.net/2183/41306
dc.language.iso	eng	es_ES
dc.publisher	SAGE Publications	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2024/PLEC2021-007662/ES/BIG-eRISK: PREDICCIÓN TEMPRANA DE RIESGOS PERSONALES EN CONJUNTOS DE DATOS MASIVOS	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MECD/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/FPU20%2F02659/ES/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica, Técnica y de Innovación 2021-2023/PID2022-137061OB-C21/ES/BUSQUEDA, SELECCION Y ORGANIZACION DE CONTENIDOS PARA NECESIDADES DE INFORMACION RELACIONADAS CON LA SALUD - CONSTRUCCION DE RECURSOS Y PERSONALIZACION	es_ES
dc.relation.uri	https://doi.org/10.1177/01655515231171085	es_ES
dc.rights	Copyright © 2023 The Authors. Article reuse guidelines: sagepub.com/journals-permissions.	es_ES
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.accessRights	open access	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject	Pooling	es_ES
dc.subject	Relevance feedback	es_ES
dc.subject	Reranking	es_ES
dc.subject	Test collections	es_ES
dc.title	Relevance feedback for building pooled test collections	es_ES
dc.type	journal article	es_ES
dspace.entity.type	Publication
relation.isAuthorOfPublication	00d04042-9b75-419e-9aab-33fd14b201af
relation.isAuthorOfPublication	fef1a9cb-e346-4e53-9811-192e144f09d0
relation.isAuthorOfPublication	a3e43020-ee28-428d-8087-2f3c1e20aa2c
relation.isAuthorOfPublication.latestForDiscovery	00d04042-9b75-419e-9aab-33fd14b201af

Files

Original bundle

Name: Otero_David_2023_Relevance_feedback_for_building_pooled_test_collections.pdf
Size: 752.23 KB
Format: Adobe Portable Document Format

Collections

Investigación (FIC)