Compressed and queryable self-indexes for RDF archives

Identifiers

Publication date

Authors

Fernández, Javier D.
Martínez-Prieto, Miguel A.

Advisors

Other responsibilities

Journal Title

Bibliographic citation

A. Cerdeira-Pena, G. de Bernardo, A. Fariña, J. D. Fernández, and M. A. Martínez-Prieto, "Compressed and queryable self-indexes for RDF archives," Knowl Inf Syst, Aug. 2023, doi: 10.1007/s10115-023-01967-7.

Type of academic work

Academic degree

Abstract

RDF compression and querying are consolidated topics in the Web of Data, with a plethora of solutions to efficiently store and query static datasets. However, as RDF data changes over time, it becomes necessary to keep different versions of RDF datasets, in what is called an RDF archive. For large RDF datasets, naive techniques to store these versions lead to significant scalability problems. In this paper, we present v-RDF-SI, one of the first RDF archiving solutions that aim to combine compression and fast querying. In v-RDF-SI, we extend existing RDF representations based on compact data structures to provide efficient support for version-based queries in compressed space. We present two implementations of v-RDF-SI, named v-RDFCSA and v-HDT, based, respectively, on RDFCSA (an RDF self-index) and HDT (a W3C-supported compressed RDF representation). We experimentally evaluate v-RDF-SI over a public benchmark named BEAR, showing that v-RDF-SI drastically reduces space requirements, being up to 40 times smaller than the baselines provided by BEAR and 4 times smaller than alternatives based on compact data structures, while yielding significantly faster query times in most cases. On average, the fastest variants of v-RDF-SI outperform the alternatives by almost an order of magnitude.

Description

This version of the article has been accepted for publication, after peer review, and is subject to Knowledge and Information Systems, but is not the Version of Record and does not reflect post-acceptance improvements or any corrections. The Version of Record is available online at: https://doi.org/10.1007/s10115-023-01967-7

Rights

© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2023