dc.contributor.author | Martínez-Sánchez, Marco | |
dc.contributor.author | Expósito, Roberto R. | |
dc.contributor.author | Touriño, Juan | |
dc.date.accessioned | 2022-01-20T18:17:24Z | |
dc.date.available | 2022-01-20T18:17:24Z | |
dc.date.issued | 2021 | |
dc.identifier.citation | Martínez-Sánchez, M.; Expósito, R.R.; Touriño, J. Performance Optimization of a Parallel Error Correction Tool. Eng. Proc. 2021, 7, 34. https://doi.org/10.3390/engproc2021007034 | es_ES |
dc.identifier.uri | http://hdl.handle.net/2183/29455 | |
dc.description | Presented at the 4th XoveTIC Conference, A Coruña, Spain, 7–8 October 2021. | es_ES |
dc.description.abstract | [Abstract] Continuous advances in Next Generation Sequencing (NGS) technologies allow researchers to process larger genetic samples in less time, which makes it important to improve the existing algorithms that enhance the quality of the generated reads. In this work, we present a Big Data tool built on the open-source Apache Spark framework that executes validated error-correction algorithms with improved performance. The experimental evaluation conducted on a multi-core cluster has shown significant improvements in execution times, providing a maximum speedup of 9.5 over existing error correction tools when processing an NGS dataset with 25 million reads. | es_ES |
dc.description.sponsorship | This research was funded by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00/AEI/10.13039/501100011033), and by Xunta de Galicia and FEDER funds of the European Union (Centro de Investigación de Galicia accreditation 2019-2022, ref. ED431G2019/01; Consolidation Program of Competitive Reference Groups, ref. ED431C 2021/30) | es_ES |
dc.description.sponsorship | Xunta de Galicia; ED431G2019/01 | es_ES |
dc.description.sponsorship | Xunta de Galicia; ED431C 2021/30 | es_ES |
dc.language.iso | eng | es_ES |
dc.publisher | MDPI | es_ES |
dc.relation.uri | https://doi.org/10.3390/engproc2021007034 | es_ES |
dc.rights | Attribution 4.0 International | es_ES |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.subject | High performance computing | es_ES |
dc.subject | Big data | es_ES |
dc.subject | Bioinformatics | es_ES |
dc.subject | Next generation sequencing | es_ES |
dc.title | Performance Optimization of a Parallel Error Correction Tool | es_ES |
dc.type | conference output | es_ES |
dc.rights.accessRights | open access | es_ES |
UDC.journalTitle | Engineering Proceedings | es_ES |
UDC.volume | 7 | es_ES |
UDC.issue | 1 | es_ES |
UDC.startPage | 34 | es_ES |
dc.identifier.doi | 10.3390/engproc2021007034 | |
UDC.coleccion | Research | es_ES |
UDC.departamento | Enxeñaría de Computadores | es_ES |
UDC.grupoInv | Grupo de Arquitectura de Computadores (GAC) | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-104184RB-I00/ES/DESAFIOS ACTUALES EN HPC: ARQUITECTURAS, SOFTWARE Y APLICACIONES | |