
dc.contributor.author: Expósito, Roberto R.
dc.contributor.author: González-Domínguez, Jorge
dc.contributor.author: Touriño, Juan
dc.date.accessioned: 2023-12-18T11:00:12Z
dc.date.available: 2023-12-18T11:00:12Z
dc.date.issued: 2020
dc.identifier.citation: R. R. Expósito, J. González-Domínguez, and J. Touriño, "SMusket: Spark-based DNA error correction on distributed-memory systems", Future Generation Computer Systems, vol. 111, pp. 698-713, 2020, https://doi.org/10.1016/j.future.2019.10.038 [es_ES]
dc.identifier.uri: http://hdl.handle.net/2183/34529
dc.description: © 2020 Elsevier B.V. All rights reserved. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0/. This version of the article has been accepted for publication in Future Generation Computer Systems. The Version of Record is available online at https://doi.org/10.1016/j.future.2019.10.038 [es_ES]
dc.description: This is the accepted version of: R. R. Expósito, J. González-Domínguez, and J. Touriño, "SMusket: Spark-based DNA error correction on distributed-memory systems", Future Generation Computer Systems, vol. 111, pp. 698-713, 2020, https://doi.org/10.1016/j.future.2019.10.038 [es_ES]
dc.description.abstract: [Abstract]: Next-Generation Sequencing (NGS) technologies have revolutionized genomics research over the last decade, bringing new opportunities for scientists to perform groundbreaking biological studies. Error correction in NGS datasets is considered an important preprocessing step in many workflows, as sequencing errors can severely affect the quality of downstream analysis. Although current error correction approaches provide reasonably high accuracies, their computational cost can still be unacceptable when processing large datasets. In this paper we propose SparkMusket (SMusket), a Big Data tool built upon the open-source Apache Spark cluster computing framework to boost the performance of Musket, one of the most widely adopted and top-performing multithreaded correctors. Our tool efficiently exploits Spark features to implement a scalable error correction algorithm intended for distributed-memory systems built using commodity hardware. The experimental evaluation on a 16-node cluster using four publicly available datasets has shown that SMusket is up to 15.3 times faster than previous state-of-the-art MPI-based tools, also providing a maximum speedup of 29.8 over its multithreaded counterpart. SMusket is publicly available under an open-source license at https://github.com/rreye/smusket [es_ES]
dc.description.sponsorship: This work was supported by the Ministry of Economy, Industry and Competitiveness of Spain and FEDER, Spain funds of the European Union (project TIN2016-75845-P, AEI/FEDER/EU); and by Xunta de Galicia, Spain (projects ED431G/01 and ED431C 2017/04). [es_ES]
dc.description.sponsorship: Xunta de Galicia; ED431G/01 [es_ES]
dc.description.sponsorship: Xunta de Galicia; ED431C 2017/04 [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: Elsevier B.V. [es_ES]
dc.relation: info:eu-repo/grantAgreement/MINECO/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2016-75845-P/ES/NUEVOS DESAFIOS EN COMPUTACION DE ALTAS PRESTACIONES: DESDE ARQUITECTURAS HASTA APLICACIONES (II)/ [es_ES]
dc.relation.isversionof: https://doi.org/10.1016/j.future.2019.10.038
dc.relation.uri: https://doi.org/10.1016/j.future.2019.10.038 [es_ES]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Spain [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [*]
dc.subject: Next-Generation Sequencing (NGS) [es_ES]
dc.subject: Sequence analysis [es_ES]
dc.subject: Big Data [es_ES]
dc.subject: Apache Spark [es_ES]
dc.subject: Error correction [es_ES]
dc.title: SMusket: Spark-based DNA error correction on distributed-memory systems [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.rights.access: info:eu-repo/semantics/openAccess [es_ES]
UDC.journalTitle: Future Generation Computer Systems [es_ES]
UDC.volume: 111 [es_ES]
UDC.startPage: 698 [es_ES]
UDC.endPage: 713 [es_ES]
dc.identifier.doi: 10.1016/j.future.2019.10.038

