Performance Optimization of a Parallel Error Correction Tool

Authors

Martínez-Sánchez, Marco

Bibliographic citation

Martínez-Sánchez, M.; Expósito, R.R.; Touriño, J. Performance Optimization of a Parallel Error Correction Tool. Eng. Proc. 2021, 7, 34. https://doi.org/10.3390/engproc2021007034

Abstract

Continuous development in Next Generation Sequencing (NGS) technologies has allowed researchers to process larger genetic samples in less time, which makes it important to improve the existing algorithms that enhance the quality of the generated reads. In this work, we present a Big Data tool, built on the open-source Apache Spark framework, that executes validated error-correction algorithms with improved performance. The experimental evaluation, conducted on a multi-core cluster, shows significant reductions in execution time, with a maximum speedup of 9.5 over existing error correction tools when processing an NGS dataset with 25 million reads.

Description

Presented at the 4th XoveTIC Conference, A Coruña, Spain, 7–8 October 2021.

Rights

Attribution 4.0 International

Except where otherwise noted, this item's license is described as Attribution 4.0 International