Performance Optimization of a Parallel Error Correction Tool

Use this link to cite
http://hdl.handle.net/2183/29455
Except where otherwise noted, this item's license is described as Attribution 4.0 International
Collections
- Research (FIC) [1636]
Metadata
Title
Performance Optimization of a Parallel Error Correction Tool
Date
2021
Citation
Martínez-Sánchez, M.; Expósito, R.R.; Touriño, J. Performance Optimization of a Parallel Error Correction Tool. Eng. Proc. 2021, 7, 34. https://doi.org/10.3390/engproc2021007034
Abstract
[Abstract] Due to the continuous development in the field of Next Generation Sequencing (NGS) technologies that have allowed researchers to take advantage of greater genetic samples in less time, it is a matter of relevance to improve the existing algorithms aimed at the enhancement of the quality of those generated reads. In this work, we present a Big Data tool implemented upon the open-source Apache Spark framework that is able to execute validated error-correction algorithms at an improved performance. The experimental evaluation conducted on a multi-core cluster has shown significant improvements in execution times, providing a maximum speedup of 9.5 over existing error correction tools when processing an NGS dataset with 25 million reads.
Keywords
High performance computing
Big data
Bioinformatics
Next generation sequencing
Description
Presented at the 4th XoveTIC Conference, A Coruña, Spain, 7–8 October 2021.
Publisher's version
Rights
Attribution 4.0 International