Space/time-efficient RDF stores based on circular suffix sorting
| UDC.coleccion | Investigación | es_ES |
| UDC.departamento | Ciencias da Computación e Tecnoloxías da Información | es_ES |
| UDC.endPage | 5683 | es_ES |
| UDC.grupoInv | Laboratorio de Bases de Datos (LBD) | es_ES |
| UDC.journalTitle | The Journal of Supercomputing | es_ES |
| UDC.startPage | 5643 | es_ES |
| UDC.volume | 79 | es_ES |
| dc.contributor.author | Brisaboa, Nieves R. | |
| dc.contributor.author | Cerdeira-Pena, Ana | |
| dc.contributor.author | Bernardo, Guillermo de | |
| dc.contributor.author | Fariña, Antonio | |
| dc.contributor.author | Navarro, Gonzalo | |
| dc.date.accessioned | 2023-12-18T15:18:13Z | |
| dc.date.embargoEndDate | 2024-04-01 | es_ES |
| dc.date.embargoLift | 2024-04-01 | |
| dc.date.issued | 2023-03 | |
| dc.description | This version of the article has been accepted for publication, after peer review and is subject to Springer Nature’s AM terms of use, but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: https://doi.org/10.1007/s11227-022-04890-w | es_ES |
| dc.description.abstract | [Abstract]: The resource description framework (RDF) has gained popularity as a format for the standardized publication and exchange of information in the Web of Data. In this paper, we introduce RDFCSA, a compressed representation of RDF datasets that in addition supports efficient querying. RDFCSA regards the triples of the RDF store as short circular strings and applies suffix sorting on those strings, so that triple-pattern queries reduce to prefix searching on the string set. The RDF store is then represented compactly using a compressed suffix array (CSA), a proved technology in text indexing that efficiently supports prefix searches. Our experiments show that RDFCSA is competitive with state-of-the-art alternatives. It compresses the raw data to 60% of its size, close to the most compact alternatives. While most alternatives perform better in some kinds of triple-patterns than in others, RDFCSA features fast and consistent query times, a few microseconds per result in all cases. This enables efficiently supporting join queries by using either merge- or chaining-join strategies over the triple patterns coupled with some specific optimizations such as variable filling. Our experiments on binary joins show that RDFCSA is faster than the alternatives in most cases. | es_ES |
| dc.description.sponsorship | Funding for the Spanish group: projects funded by MCIN/AEI/10.13039/501100011033: PDC2021-121239-C31 (FLATCITY-POC)-“NextGenerationEU”/PRTR; PDC2021-120917-C21 (SIGTRANS)-“NextGenerationEU”/PRTR; PID2020-114635RB-I00 (EXTRACompact); PID2019-105221RB-C41 (MAGIST); PID2021-122554OB-C33 (OASSIS-UDC); and TED2021-129245B-C21 (PLAGEMIS-UDC); grant ED431C 2021/53 (GRC) funded by GAIN/Xunta de Galicia; and grant ED431G 2019/01 (CSI) funded by Xunta de Galicia, FEDER Galicia 2014-2020 80%, SXU 20%; Gonzalo Navarro is partially funded by Fondecyt 1-200038, and by ANID - Millennium Science Initiative Program - Code ICN17_002. | es_ES |
| dc.description.sponsorship | Xunta de Galicia; ED431C 2021/53 | es_ES |
| dc.description.sponsorship | Xunta de Galicia; ED431G 2019/01 | es_ES |
| dc.description.sponsorship | Chile. Fondo Nacional de Desarrollo Científico y Tecnológico (Fondecyt); 1-200038. | |
| dc.description.sponsorship | Chile. Agencia Nacional de Investigación y Desarrollo; ICN17_002 | |
| dc.identifier.citation | Brisaboa, N.R., Cerdeira-Pena, A., de Bernardo, G. et al. Space/time-efficient RDF stores based on circular suffix sorting. J Supercomput 79, 5643–5683 (2023). https://doi.org/10.1007/s11227-022-04890-w | es_ES |
| dc.identifier.doi | 10.1007/s11227-022-04890-w | |
| dc.identifier.issn | 1573-0484 | |
| dc.identifier.uri | http://hdl.handle.net/2183/34535 | |
| dc.language.iso | eng | es_ES |
| dc.publisher | Springer Nature | es_ES |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PDC2021-121239-C31/ES/FLATCITY-POC | es_ES |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PDC2021-120917-C21/ES/SIGTRANS | es_ES |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-114635RB-I00/ES/EXPLOTACION ENRIQUECIDA DE TRAYECTORIAS CON ESTRUCTURAS DE DATOS COMPACTAS Y GIS/ | es_ES |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-105221RB-C41/ES/VISUALIZACION Y EXPLORACION BASADA EN FLUJOS Y ANALITICA DE BIG DATA ESPACIAL | es_ES |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-122554OB-C33/ES/OASSIS-UDC: HACIA ORGANIZACIONES SOFTWARE MÁS SOSTENIBLES: UN ENFOQUE HOLÍSTICO PARA PROMOVER LA SOSTENIBILIDAD ECONÓMICA, HUMANA Y MEDIOAMBIENTAL | es_ES |
| dc.relation.uri | https://doi.org/10.1007/s11227-022-04890-w | es_ES |
| dc.rights | © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 | es_ES |
| dc.rights.accessRights | open access | es_ES |
| dc.subject | Compact data structures | es_ES |
| dc.subject | RDF | es_ES |
| dc.subject | CSA | es_ES |
| dc.subject | Web of data | es_ES |
| dc.title | Space/time-efficient RDF stores based on circular suffix sorting | es_ES |
| dc.type | journal article | es_ES |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 42f2c226-9868-4516-8efd-2cd3c6692034 | |
| relation.isAuthorOfPublication | e09ccaa0-3a7f-4463-b6e7-db404361f097 | |
| relation.isAuthorOfPublication | 23354397-ec74-4cbb-93ac-f85352e9fbd8 | |
| relation.isAuthorOfPublication | 2fe2b113-791f-4229-a83a-311d0c8b5ce6 | |
| relation.isAuthorOfPublication.latestForDiscovery | 42f2c226-9868-4516-8efd-2cd3c6692034 | |
Files
Original bundle
- Name: Brisaboa_Nieves_2023_Space_time_efficient_RDF_stores.pdf
- Size: 1.93 MB
- Format: Adobe Portable Document Format
- Description: Accepted version

