Efficient and compact representations of some non-canonical prefix-free codes

UDC.coleccionInvestigación
UDC.departamentoCiencias da Computación e Tecnoloxías da Información
UDC.endPage25
UDC.grupoInvLaboratorio de Bases de Datos (LBD)
UDC.institutoCentroCITIC - Centro de Investigación de Tecnoloxías da Información e da Comunicación
UDC.journalTitleTheoretical Computer Science
UDC.startPage11
UDC.volume907
dc.contributor.authorFariña, Antonio
dc.contributor.authorGagie, Travis
dc.contributor.authorGrabowski, Marcin
dc.contributor.authorManzini, Giovanni
dc.contributor.authorNavarro, Gonzalo
dc.contributor.authorOrdóñez, Alberto
dc.date.accessioned2025-11-05T13:54:41Z
dc.date.available2025-11-05T13:54:41Z
dc.date.issued2022-03-12
dc.description©2022 Elsevier B.V. All rights reserved. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/. This version of the article has been accepted for publication in Theoretical Computer Science. The Version of Record is available online at https://doi.org/10.1016/j.tcs.2022.01.010 Versión final de: https://doi.org/10.1007/978-3-319-46049-9_5
dc.description.abstract[Abstract]: For many kinds of prefix-free codes there are efficient and compact alterna-tives to the traditional tree-based representation. Since these put the codes into canonical form, however, they can only be used when we can choose the order in which codewords are assigned to symbols. In this paper we first show how, given a probability distribution over an alphabet of σ symbols, we can store an optimal alphabetic prefix-free code in O(σ lg L) bits such that we can encode and decode any codeword of length  inO (min(, lg L)) time, where L is the maximum codeword length. With O 2L further bits, for any constant  > 0, we can encode and decode O(lg ) time. We then show how to store a nearly optimal alphabetic prefix-free code in o(σ) bits such that we can encode and decode in constant time. We also consider a kind of optimal prefix-free code introduced recently where the codewords’ lengths are non-decreasing if arranged in lexicographic order of their reverses. We reduce their storage space to O(σ lg L) while maintaining encoding and de-coding times in O(). We also show how, with O 2L further bits, we can encode and decode in constant time. All of our results hold in the word-RAM model.
dc.description.sponsorshipThis research was carried out in part at University of A Coruña, Spain, while the second author was visiting from the University of Helsinki and the sixth author was a PhD student there. It started at a StringMasters workshop at the Research Center on Information and Communication Technologies (CITIC) of the University of A Coruña. The workshop was funded in part by European Union's Horizon 2020 research and innovation programme under the Marie Sk lodowska-Curie grant agreement No 690941 (project BIRDS). The authors thank Nieves Brisaboa and Susana Ladra. The first author was supported by the CITIC research center funded by Xunta de Galicia/FEDER-UE 2014-2020 Program, grant CSI:ED431G 2019/01; by MICIU/FEDER-UE, grant BIZDEVOPSGLOBAL: RTI2018-098309-B-C32; and by Xunta de Galicia/FEDER-UE, ConectaPeme grant GEMA: IN852A 2018/14. The second author was supported by Academy of Finland grants 268324 and 250345 (CoECGR), Fondecyt Grant 1-171058, and NSERC grant RGPIN-07185-2020. The fourth author was supported by PRIN grant 2017WR7SHH, and by the INdAM-GNCS Project 2020 MFAIS- IoT. The fifth author was supported by Fondecyt Grant 1-200038, Basal Funds FB0001, and ANID - Millennium Science Initiative Program - Code ICN17 002, Chile.
dc.description.sponsorshipXunta de Galicia; ED431G 2019/01
dc.description.sponsorshipXunta de Galicia; IN852A 2018/14
dc.description.sponsorshipChile. Fondo Nacional de Desarrollo Científico y Tecnológico (Fondecyt); 1-171058
dc.description.sponsorshipChile. Fondo Nacional de Desarrollo Científico y Tecnológico (Fondecyt); 1-200038
dc.description.sponsorshipCanada. Natural Science and Engineering Research Council (NSERC); RGPIN-07185-2020
dc.description.sponsorshipItalia. Ministero dell'Istruzione, dell'Università e della Ricerca; 2017WR7SHH
dc.description.sponsorshipSuomen Akatemia; 268324
dc.description.sponsorshipSuomen Akatemia; 250345
dc.description.urihttp://hdl.handle.net/2183/18174
dc.identifier.citationA. Fariña, T. Gagie, S. Grabowski, G. Manzini, G. Navarro, and A. Ordóñez, "Efficient and compact representations of some non-canonical prefix-free codes", Theoretical Computer Science, Vol. 907, 12 Mar. 2022, pp. 11-25, https://doi.org/10.1016/j.tcs.2022.01.010
dc.identifier.doi10.1016/j.tcs.2022.01.010
dc.identifier.issn1879-2294
dc.identifier.urihttps://hdl.handle.net/2183/46282
dc.language.isoeng
dc.publisherElsevier
dc.relation.projectIDinfo:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-098309-B-C32/ES/BIZDEVOPS-GLOBAL: UN FRAMEWORK TECNOLOGICO Y METODOLOGICO SOSTENIBLE PARA EL DESARROLLO DE SOFTWARE ALINEADO CON EL NEGOCIO EN DEVOPS GLOBAL/
dc.relation.projectIDinfo:eu-repo/grantAgreement/EC/H2020/690941/EU
dc.relation.urihttps://doi.org/10.1016/j.tcs.2022.01.010
dc.rightsAttribution-NonCommercial-NoDerivatives 4.0 Internationalen
dc.rights.accessRightsopen access
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subjectCompact data structures
dc.subjectPrefix-free codes
dc.subjectAlphabetic codes
dc.subjectavelet matrix
dc.titleEfficient and compact representations of some non-canonical prefix-free codes
dc.typejournal article
dc.type.hasVersionAM
dspace.entity.typePublication
relation.isAuthorOfPublication2fe2b113-791f-4229-a83a-311d0c8b5ce6
relation.isAuthorOfPublication.latestForDiscovery2fe2b113-791f-4229-a83a-311d0c8b5ce6

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Farina_Antonio_2022_Efficient_and_compact_representations_of_some_non_canonical.pdf
Size:
1.07 MB
Format:
Adobe Portable Document Format