The wisdom of the rankers: a cost-effective method for building pooled test collections without participant systems

Otero, David; Parapar, Javier; Barreiro, Álvaro

Use this link to cite:

http://hdl.handle.net/2183/41377

Files

Otero_David_2021_The_wisdom_of_the_rankers.pdf (4.35 MB)

Identifiers

URI: http://hdl.handle.net/2183/41377

DOI: 10.1145/3412841.3441947

Publication date

2021

Authors

Otero, David

Parapar, Javier

Barreiro, Álvaro

Bibliographic citation

David Otero, Javier Parapar, and Álvaro Barreiro. 2021. The Wisdom of the Rankers: A Cost-Effective Method for Building Pooled Test Collections without Participant Systems. In The 36th ACM/SIGAPP Symposium on Applied Computing (SAC ’21), March 22–26, 2021, Virtual Event, Republic of Korea. ACM, New York, NY, USA, 9 pages.

Abstract

Information Retrieval is an area where evaluation is crucial to validate newly proposed models. As the first step in the evaluation of models, researchers carry out offline experiments on specific datasets. While the field started around ad-hoc search, the number of new tasks is continuously growing. These tasks demand the development of new test collections (documents, information needs, and judgments). The construction of those datasets relies on expensive campaigns like TREC. Due to the size of modern collections, obtaining a relevance judgment for every document-topic pair is infeasible. To reduce this cost, organizers usually apply a technique called pooling. When building pooled test collections, assessors only judge a portion of the documents, selected from among the participants' results. Although the judgments will not be exhaustive, they will be sufficiently complete and unbiased if pooling is done correctly. Therefore, researchers may safely use pooled collections to evaluate new models. However, the application of pooling depends on the existence of participant systems. This need is a handicap for tasks in which training data must be released before the competition takes place, or for those with few participants. In this paper, we present a simple method for building pooled collections when such restrictions exist. Our proposal relies on two principles: the wisdom of the rankers and the application of pooling. By creating enough artificial participant systems, we can apply pooling to their results to select the documents that merit human assessment. Using an innovative approach to evaluate our method, we show that researchers may use it to produce high-quality collections in the absence of participant systems.
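The pooling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes plain depth-k pooling (taking the top k documents from every system for each topic), which is one common pooling strategy and not necessarily the one used in the paper. The function name `depth_k_pool`, the ranker names, and the document identifiers are all hypothetical.

```python
from collections import defaultdict


def depth_k_pool(runs, k):
    """Build a depth-k pool from several systems' rankings.

    runs: {system_name: {topic_id: [doc_id, ...] in ranked order}}
    Returns {topic_id: set of doc_ids to send to human assessors}.
    """
    pool = defaultdict(set)
    for rankings in runs.values():
        for topic, ranked_docs in rankings.items():
            # Only the top k documents of each ranking enter the pool.
            pool[topic].update(ranked_docs[:k])
    return dict(pool)


# Hypothetical rankings standing in for artificial participant systems.
runs = {
    "ranker_a": {"t1": ["d1", "d2", "d3"], "t2": ["d7", "d8"]},
    "ranker_b": {"t1": ["d2", "d4", "d1"], "t2": ["d8", "d9"]},
    "ranker_c": {"t1": ["d5", "d1", "d2"], "t2": ["d7", "d9"]},
}

pool = depth_k_pool(runs, k=2)
```

Documents outside the pool stay unjudged, so the assessment cost grows with the pool size rather than with the size of the collection.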

Description

This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in SAC '21: Proceedings of the 36th Annual ACM Symposium on Applied Computing, https://doi.org/10.1145/3412841.3441947

Keywords

Information systems; Information retrieval; Evaluation of retrieval results; Test collections; Pooling

Editor version

https://doi.org/10.1145/3412841.3441947

Collections

Investigación (FIC)