dc.contributor.author | Silva, Jorge Miguel | |
dc.contributor.author | Almeida, João Rafael | |
dc.contributor.author | Oliveira, José Luís | |
dc.date.accessioned | 2023-11-22T09:05:21Z | |
dc.date.available | 2023-11-22T09:05:21Z | |
dc.date.issued | 2023 | |
dc.identifier.citation | J.M. Silva, J.R. Almeida, and J.L. Oliveira, "Classifying and discovering genomic sequences in metagenomic repositories", Procedia Computer Science, vol. 219, pp. 1501-1508, 2023. doi: 10.1016/j.procs.2023.01.441 | es_ES |
dc.identifier.uri | http://hdl.handle.net/2183/34310 | |
dc.description.abstract | [Abstract]: The taxonomic and functional composition of microbial communities from environmental, agricultural, and therapeutic settings is increasingly being studied using metagenomic methodologies in large-scale genomic applications. This has led to exponential growth in the field and has impacted healthcare, pharmacology, and biotechnology. However, with the current methodologies, it is sometimes difficult to obtain conclusive identification of an organism. In addition, the growth of the metagenomic field has led to the creation of large amounts of data held by different hosts, which characterize data differently and make analysis difficult. Therefore, correct data aggregation and classification improve and facilitate the discovery of repositories of interest. This paper tackles these issues by proposing a methodology for organism identification, data aggregation and content characterization, visualization and selection. We propose a three-step pipeline for organism identification that uses compression-based metrics, an aggregation mechanism for content characterization, and a web database catalogue for data exposition and visualization. | es_ES |
dc.description.sponsorship | This work has received funding from the EC under grant agreement 101081813, Genomic Data Infrastructure. J.M.S. and J.R.A. are funded by the FCT - Foundation for Science and Technology (national funds) under the grants SFRH/BD/141851/2018 and SFRH/BD/147837/2019, respectively. | es_ES |
dc.description.sponsorship | Portugal. Fundação para a Ciência e a Tecnologia; SFRH/BD/141851/2018 | |
dc.description.sponsorship | Portugal. Fundação para a Ciência e a Tecnologia; SFRH/BD/147837/2019 | |
dc.language.iso | eng | es_ES |
dc.publisher | Elsevier B.V. | es_ES |
dc.relation | info:eu-repo/grantAgreement/EC/HE/101081813 | es_ES |
dc.relation.uri | https://doi.org/10.1016/j.procs.2023.01.441 | es_ES |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND) | es_ES |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | * |
dc.subject | Taxonomic Classification | es_ES |
dc.subject | Organism Identification | es_ES |
dc.subject | Compression | es_ES |
dc.subject | Web Portal | es_ES |
dc.subject | Data Aggregation | es_ES |
dc.subject | Genomic Catalogue | es_ES |
dc.title | Classifying and discovering genomic sequences in metagenomic repositories | es_ES |
dc.type | info:eu-repo/semantics/article | es_ES |
dc.rights.access | info:eu-repo/semantics/openAccess | es_ES |
UDC.journalTitle | Procedia Computer Science | es_ES |
UDC.volume | 219 | es_ES |
UDC.startPage | 1501 | es_ES |
UDC.endPage | 1508 | es_ES |
dc.identifier.doi | 10.1016/j.procs.2023.01.441 | |
UDC.conferenceTitle | 2022 International Conference on ENTERprise Information Systems, CENTERIS 2022 - International Conference on Project MANagement, ProjMAN 2022 and International Conference on Health and Social Care Information Systems and Technologies, HCist 2022, Lisbon 9-11 Nov. 2022 | es_ES |
UDC.coleccion | Investigación | |
UDC.departamento | Ciencias da Computación e Tecnoloxías da Información | |
UDC.grupoInv | Redes de Neuronas Artificiais e Sistemas Adaptativos - Informática Médica e Diagnóstico Radiolóxico (RNASA - IMEDIR) | |