Zombit: Exploiting Runs in Bitvectors

UDC.coleccionInvestigación
UDC.departamentoCiencias da Computación e Tecnoloxías da Información
UDC.endPage18
UDC.grupoInvLaboratorio de Bases de Datos (LBD)
UDC.institutoCentroCITIC - Centro de Investigación de Tecnoloxías da Información e da Comunicación
UDC.journalTitleSoftware: Practice and Experience
UDC.startPage1
UDC.volume2025
dc.contributor.authorGómez-Brandón, Adrián
dc.date.accessioned2025-09-24T11:16:36Z
dc.date.available2025-09-24T11:16:36Z
dc.date.issued2025-09-06
dc.descriptionThe data that support the findings of this study are openly available in zombit-vector at https://github.com/adriangbrandon/zombit-vector. Financiado para publicación en acceso aberto: Universidade da Coruña/CISUG
dc.description.abstract[Abstract]: Introduction: Bitvectors are a fundamental building block of many compact data structures. In this work, we propose a new compressed representation for bitvectors, named zombit-vector, which compresses bitvectors with 𝑘 runs by splitting the bitvector into fixed-length blocks and classifying them into three possible types depending on their content. Methods: It supports the typical operations over bitvectors in 𝑂(1) time, but select in 𝑂(log 𝑛) time. The total space required for this structure is 𝑂(√𝑘𝑛) + 𝑜(√𝑘𝑛) bits. In addition, we introduce an extension pzombit-vector where the blocks have variablelength. The variable-length partitioning adapts better to the distribution of the data, reducing space needs but requiring 𝑂(log 𝑛) time for all operations. Results: We include experiments on synthetic data and over two real-world scenarios. The synthetic scenario shows the competitiveness of our techniques against well-known state-of-the-art alternatives. That experiment confirms that pzombit-vector is able to use one-tenth of zombit-vector, but becomes 5 times slower. Our experiments with real applications demonstrate that the proposed techniques are practical and applicable in real-world scenarios.
dc.description.sponsorshipFunding: Ministerio de Ciencia e Innovación, Grant/Award Nos: PID2020-114635RB-I00; TED2021-129245B-C21; PID2022-141027NB-C21; Xunta de Galicia, Grant/Award No: ED431C 2025/34; and CITIC receives subsidies from Xunta de Galicia and FEDER Galicia (Ref. ED431G 2023/01). Funding for open access charge: Universidade da Coruña/CISUG.
dc.description.sponsorshipXunta de Galicia; ED431C 2025/34
dc.description.sponsorshipXunta de Galicia; ED431G 2023/01
dc.identifier.citationA. Gómez-Brandón, "Zombit: Exploiting Runs in Bitvectors", Software: Practice and Experience, 2025, pp. 1–18, https://doi.org/10.1002/spe.70019
dc.identifier.doi10.1002/spe.70019
dc.identifier.issn1097-024X
dc.identifier.issn0038-0644
dc.identifier.urihttps://hdl.handle.net/2183/45808
dc.language.isoeng
dc.publisherWiley
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-114635RB-I00/ES/EXPLOTACION ENRIQUECIDA DE TRAYECTORIAS CON ESTRUCTURAS DE DATOS COMPACTAS Y GIS/
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/TED2021-129245B-C21/ES/PLATAFORMA PARA LA GENERACIÓN AUTOMÁTICA DE SISTEMAS DE INFORMACIÓN DE LA MOVILIDAD ENERGÉTICAMENTE EFICIENTES, BASADOS EN ESTRUCTURAS DE DATOS COMPACTAS Y GIS (PLAGEMIS)
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica, Técnica y de Innovación 2021-2023/PID2022-141027NB-C21/ES/MODELADO, DESCUBRIMIENTO, EXPLORACION Y ANALISIS DE DATA LAKES MEDIOAMBIENTALES
dc.relation.urihttps://doi.org/10.1002/spe.70019
dc.rightsAttribution 4.0 Internationalen
dc.rights.accessRightsopen access
dc.rights.urihttp://creativecommons.org/licenses/by/4.0/
dc.subjectbitvectors
dc.subjectcompact data structures
dc.subjectbitvectors; compact data structures; compression; indices
dc.subjectindices
dc.titleZombit: Exploiting Runs in Bitvectors
dc.typejournal article
dc.type.hasVersionVoR
dspace.entity.typePublication
relation.isAuthorOfPublication1a99c615-806a-48b0-8e5f-7772467f275d
relation.isAuthorOfPublication.latestForDiscovery1a99c615-806a-48b0-8e5f-7772467f275d

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
GomezBrandon_Adrian_2025_Zombit.pdf
Size:
1.09 MB
Format:
Adobe Portable Document Format