The Ring: Worst-case Optimal Joins in Graph Databases using (Almost) No Extra Space

UDC.coleccionInvestigaciónes_ES
UDC.departamentoCiencias da Computación e Tecnoloxías da Informaciónes_ES
UDC.grupoInvLaboratorio de Bases de Datos (LBD)es_ES
UDC.issue2es_ES
UDC.journalTitleACM Transactions on Database Systemses_ES
UDC.volume49es_ES
dc.contributor.authorArroyuelo, Diego
dc.contributor.authorGómez-Brandón, Adrián
dc.contributor.authorHogan, Aidan
dc.contributor.authorNavarro, Gonzalo
dc.contributor.authorReutter, Juan
dc.contributor.authorRojas-Ledesma, Javiel
dc.contributor.authorSoto, Adrián
dc.date.accessioned2024-07-10T08:43:58Z
dc.date.available2024-07-10T08:43:58Z
dc.date.issued2024-03-23
dc.description© ACM 2024. This is the author's version of the work (accepted manuscript or postprint). It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Database Systems, https://doi.org/10.1145/3644824es_ES
dc.description.abstract[Absctract]: We present an indexing scheme for triple-based graphs that supports join queries in worst-case optimal (wco) time within compact space. This scheme, called a ring, regards each triple as a cyclic string of length 3. Each rotation of the triples is lexicographically sorted and the values of the last attribute are stored as a column, so we obtain the order of the next column by stably re-sorting the triples by its attribute. We show that, by representing the columns with a compact data structure called a wavelet tree, this ordering enables forward and backward navigation between columns without needing pointers. These wavelet trees further support wco join algorithms and cardinality estimations for query planning. While traditional data structures such as B-Trees, tries, and so on, require 6 index orders to support all possible wco joins over triples, we can use one ring to index them all. This ring replaces the graph and uses only sublinear extra space, thus supporting wco joins in almost no space beyond storing the graph itself. Experiments querying a large graph (Wikidata) in memory show that the ring offers nearly the best overall query times while using only a small fraction of the space required by several state-of-the-art approaches. We then turn our attention to some theoretical results for indexing tables of arity d higher than 3 in such a way that supports wco joins. While a single ring of length d no longer suffices to cover all d! orders, we need much fewer rings to index them all: O(2d) rings with a small constant. For example, we need 5 rings instead of 120 orders for d=5. We show that our rings become a particular case of what we dub order graphs, whose nodes are attribute orders and where stably sorting by some attribute leads us from an order to another, thereby inducing an edge labeled by the attribute. The index is then the set of columns associated with the edges, and a set of rings is just one possible graph shape. We show that other shapes, like for example a single ring instead of several ones of length d, can lead us to even smaller indexes, and that other more general shapes are also possible. For example, we handle d=5 attributes within space equivalent to 4 rings.es_ES
dc.description.sponsorshipWe thank Amine Mhedhbi for help with Graphflow and Daniela Campos for help with CompactLTJ. This work was supported by ANID – Millennium Science Initiative Program – Code ICN17_002. Gómez-Brandón was supported in part by MCIN/AEI/10.13039/5011000-11033: grants PID2020- 114635RB-I00; by MCIN/AEI/10.13039/501100011033 and EU/ERDF "A way of making Europe": PID2022-141027NB-C21; by MCIN/AEI/10.13039/501100011033 and “Next-GenerationEU”/ PRTR: grants TED2021-129245B-C21, PDC2021-120917-C21 and by GAIN/ Xunta de Galicia: GRC: grants ED431C 2021/53, and CIGUS 2023-2026. Hogan was supported by FONDECYT Grant No. 1221926. Navarro was supported by FONDECYT Grant No. 1230755. Reutter was supported by FONDECYT Grant No. 1221799.es_ES
dc.description.sponsorshipXunta de Galicia; ED431C 2021/53es_ES
dc.description.sponsorshipXunta de Galicia; CIGUS 2023-2026es_ES
dc.description.sponsorshipChile. Agencia Nacional de Investigación y Desarrollo; ICN17_002es_ES
dc.description.sponsorshipChile. Fondo Nacional de Desarrollo Científico y Tecnológico (Fondecyt); 1221926es_ES
dc.description.sponsorshipChile. Fondo Nacional de Desarrollo Científico y Tecnológico (Fondecyt); 1230755es_ES
dc.description.sponsorshipChile. Fondo Nacional de Desarrollo Científico y Tecnológico (Fondecyt); 1221799es_ES
dc.identifier.citationDiego Arroyuelo, Adrián Gómez-Brandón, Aidan Hogan, Gonzalo Navarro, Juan Reutter, Javiel Rojas-Ledesma, and Adrián Soto. 2024. The Ring: Worst-case Optimal Joins in Graph Databases using (Almost) No Extra Space. ACM Trans. Database Syst. 49, 2, Article 5 (June 2024), 45 pages. https://doi.org/10.1145/3644824es_ES
dc.identifier.issn0362-5915
dc.identifier.issn1557-4644
dc.identifier.urihttp://hdl.handle.net/2183/37862
dc.language.isoenges_ES
dc.publisherAssociation for Computing Machinery (ACM)es_ES
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-114635RB-I00/ES/EXPLOTACIÓN ENRIQUECIDA DE TRAYECTORIAS CON ESTRUCTURAS DE DATOS COMPACTAS Y GISes_ES
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-141027NB-C21/ES/MODELADO, DESCUBRIMIENTO, EXPLORACION Y ANALISIS DE DATA LAKES MEDIOAMBIENTALESes_ES
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/TED2021-129245B-C21/ES/PLAGEMISes_ES
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PDC2021-120917-C21/ES/SIGTRANSes_ES
dc.relation.urihttps://doi.org/10.1145/364482es_ES
dc.rights.accessRightsopen accesses_ES
dc.subjectWorst-case optimal joinses_ES
dc.subjectGraph patternses_ES
dc.subjectGraph databaseses_ES
dc.subjectGraph indexinges_ES
dc.subjectColumn storeses_ES
dc.subjectWavelet treeses_ES
dc.titleThe Ring: Worst-case Optimal Joins in Graph Databases using (Almost) No Extra Spacees_ES
dc.typejournal articlees_ES
dspace.entity.typePublication
relation.isAuthorOfPublication1a99c615-806a-48b0-8e5f-7772467f275d
relation.isAuthorOfPublication.latestForDiscovery1a99c615-806a-48b0-8e5f-7772467f275d

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Arroyuelo_Diego_2024_The_ring_worst_case_optimal_joins_in_graph_db_space.pdf
Size:
1.01 MB
Format:
Adobe Portable Document Format
Description:
Accepted Manuscript