Mostrar o rexistro simple do ítem
Worst-Case-Optimal Similarity Joins on Graph Databases
dc.contributor.author | Arroyuelo, Diego | |
dc.contributor.author | Bustos, Benjamin | |
dc.contributor.author | Gómez-Brandón, Adrián | |
dc.contributor.author | Hogan, Aidan | |
dc.contributor.author | Navarro, Gonzalo | |
dc.contributor.author | Reutter, Juan | |
dc.date.accessioned | 2024-07-10T12:40:54Z | |
dc.date.available | 2024-07-10T12:40:54Z | |
dc.date.issued | 2024-03-06 | |
dc.identifier.citation | Diego Arroyuelo, Benjamin Bustos, Adrián Gómez-Brandón, Aidan Hogan, Gonzalo Navarro, and Juan Reutter. 2024. Worst-Case-Optimal Similarity Joins on Graph Databases. Proc. ACM Manag. Data 2, 1, Article 39 (February 2024), 26 pages. https://doi.org/10.1145/3639294 | es_ES |
dc.identifier.issn | 2836-6573 | |
dc.identifier.uri | http://hdl.handle.net/2183/37882 | |
dc.description | © ACM 2024. This is the author's version of the work (accepted manuscript or postprint). It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the ACM on Management of Data, https:// doi.org/10.1145/3639294 | es_ES |
dc.description.abstract | [Absctract]: We extend the concept of worst-case optimal equijoins in graph databases to the case where some nodes are required to be within the k-nearest neighbors (kNN) of others under some similarity function. We model the problem by superimposing the database graph with the kNN graph and show that a variant of Leapfrog TrieJoin (LTJ) implemented over a compact data structure called the Ring can be seamlessly extended to integrate similarity clauses with the equijoins in the LTJ query process, retaining worst-case optimality in many relevant cases. Our experiments on a benchmark that combines Wikidata and IMGpedia show that our enhanced LTJ algorithm outperforms by a considerable margin a baseline that first applies classic LTJ and then completes the query by applying the similarity predicates. The difference is more pronounced on queries where the similarity clauses are more densely connected to the query, becoming of an order of magnitude in some cases. | es_ES |
dc.description.sponsorship | Funded by ANID – Millennium Science Initiative Program – Code ICN17_002. A.G. is funded in part by MCIN/AEI/10.13039/5011000-11033: grant PID2020-114635RB-I00 (EXTRACompact); by MCIN/A-EI/10.13039/501100011033 and EU/ERDF "A way of making Europe": PID2022-141027NBC21 (EARTHDL); by MCIN/AEI/10.13039/501100011033 and “Next-GenerationEU”/ PRTR: grants TED2021-129245B-C21 (PLAGEMIS), PDC2021-120917-C21 (SIGTRANS) and by GAIN/Xunta de Galicia: GRC: grants ED431C 2021/53, and CIGUS 2023-2026. A.H. was funded in part by Fondecyt Grant 1221926. G.N. was funded in part by Fondecyt Grant 1-230755. J.R. was funded in part by Fondecyt Grant 1221799. | es_ES |
dc.description.sponsorship | Chile. Agencia Nacional de Investigación y Desarrollo; ICN17_002 | es_ES |
dc.description.sponsorship | Xunta de Galicia; ED431C 2021/53 | es_ES |
dc.description.sponsorship | Xunta de Galicia; CIGUS 2023-2026 | es_ES |
dc.description.sponsorship | Chile. Fondo Nacional de Desarrollo Científico y Tecnológico (Fondecyt); 1-230755 | es_ES |
dc.description.sponsorship | Chile. Fondo Nacional de Desarrollo Científico y Tecnológico (Fondecyt); 1221926 | es_ES |
dc.description.sponsorship | Chile. Fondo Nacional de Desarrollo Científico y Tecnológico (Fondecyt); 1221799 | es_ES |
dc.language.iso | eng | es_ES |
dc.publisher | Association for Computing Machinery (ACM) | es_ES |
dc.relation | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-141027NB-C21/ES/MODELADO, DESCUBRIMIENTO, EXPLORACION Y ANALISIS DE DATA LAKES MEDIOAMBIENTALES | es_ES |
dc.relation | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/TED2021-129245B-C21/ES/PLAGEMIS | es_ES |
dc.relation | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PDC2021-120917-C21/ES/SIGTRANS | es_ES |
dc.relation.uri | https://doi.org/10.1145/3639294 | es_ES |
dc.subject | Worst-case optimal joins | es_ES |
dc.subject | Leapfrog Triejoin | es_ES |
dc.subject | Graph patterns | es_ES |
dc.subject | Graph databases | es_ES |
dc.subject | Graph indexing | es_ES |
dc.subject | Similarity joins | es_ES |
dc.subject | Nearest-neighbor graphs | es_ES |
dc.title | Worst-Case-Optimal Similarity Joins on Graph Databases | es_ES |
dc.type | info:eu-repo/semantics/conferenceObject | es_ES |
dc.type | info:eu-repo/semantics/conferenceObject | es_ES |
dc.rights.access | info:eu-repo/semantics/openAccess | es_ES |
UDC.journalTitle | Proceedings of the ACM on Management of Data | es_ES |
UDC.volume | 2 | es_ES |
UDC.issue | 1 | es_ES |
UDC.startPage | Nº article: 39 | es_ES |
UDC.conferenceTitle | ACM SIGMOD/PODS International Conference on Management of Data | es_ES |