Martínez-Sánchez, Marco; Expósito, Roberto R.; Touriño, Juan
2022-01-20
2021
Martínez-Sánchez, M.; Expósito, R.R.; Touriño, J. Performance Optimization of a Parallel Error Correction Tool. Eng. Proc. 2021, 7, 34. https://doi.org/10.3390/engproc2021007034
http://hdl.handle.net/2183/29455
Presented at the 4th XoveTIC Conference, A Coruña, Spain, 7–8 October 2021.
[Abstract] Continuous advances in Next Generation Sequencing (NGS) technologies allow researchers to obtain larger genetic samples in less time, which makes it important to improve the existing algorithms that enhance the quality of the generated reads. In this work, we present a Big Data tool built on the open-source Apache Spark framework that executes validated error-correction algorithms with improved performance. The experimental evaluation, conducted on a multi-core cluster, shows significant reductions in execution time, with a maximum speedup of 9.5 over existing error correction tools when processing an NGS dataset with 25 million reads.
eng
Attribution 4.0 International
http://creativecommons.org/licenses/by/4.0/
High performance computing; Big data; Bioinformatics; Next generation sequencing
Performance Optimization of a Parallel Error Correction Tool
conference output
open access
10.3390/engproc2021007034