MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud

Expósito, Roberto R.; Veiga, Jorge; González-Domínguez, Jorge; Touriño, Juan

UDC.coleccion	Investigación	es_ES
UDC.departamento	Enxeñaría de Computadores	es_ES
UDC.endPage	2764	es_ES
UDC.grupoInv	Grupo de Arquitectura de Computadores (GAC)	es_ES
UDC.issue	17	es_ES
UDC.journalTitle	Bioinformatics	es_ES
UDC.startPage	2762	es_ES
UDC.volume	33	es_ES
dc.contributor.author	Expósito, Roberto R.
dc.contributor.author	Veiga, Jorge
dc.contributor.author	González-Domínguez, Jorge
dc.contributor.author	Touriño, Juan
dc.date.accessioned	2018-07-04T14:48:36Z
dc.date.embargoEndDate	2018-09-02	es_ES
dc.date.embargoLift	2018-09-02
dc.date.issued	2017
dc.description	This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record Roberto R. Expósito, Jorge Veiga, Jorge González-Domínguez, Juan Touriño; MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud, Bioinformatics, Volume 33, Issue 17, 1 September 2017, Pages 2762–2764 is available online at: https://doi.org/10.1093/bioinformatics/btx307	es_ES
dc.description.abstract	[Abstract] This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted MapReduce programming model to fully exploit Big Data technologies on cloud-based infrastructures. Written in Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for scalable Big Data processing. On a 16-node cluster deployed on the Amazon EC2 cloud platform, MarDRe is up to 8.52 times faster than a representative state-of-the-art tool.	es_ES
dc.description.sponsorship	Ministerio de Economía y Competitividad; TIN2016-75845-P	es_ES
dc.description.sponsorship	Ministerio de Educación; FPU014/02805	es_ES
dc.identifier.citation	Roberto R. Expósito, Jorge Veiga, Jorge González-Domínguez, Juan Touriño; MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud, Bioinformatics, Volume 33, Issue 17, 1 September 2017, Pages 2762–2764, https://doi.org/10.1093/bioinformatics/btx307	es_ES
dc.identifier.doi	10.1093/bioinformatics/btx307
dc.identifier.issn	1367-4803
dc.identifier.issn	1367-4811
dc.identifier.uri	http://hdl.handle.net/2183/20848
dc.language.iso	eng	es_ES
dc.publisher	Oxford University Press	es_ES
dc.relation.uri	https://doi.org/10.1093/bioinformatics/btx307	es_ES
dc.rights.accessRights	open access	es_ES
dc.subject	MarDRe	es_ES
dc.subject	Apache Hadoop	es_ES
dc.subject	Big Data	es_ES
dc.subject	Cloud platform	es_ES
dc.subject	MapReduce	es_ES
dc.subject	Cloud-ready duplicate	es_ES
dc.title	MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud	es_ES
dc.type	journal article	es_ES
dspace.entity.type	Publication
relation.isAuthorOfPublication	6a6967e9-a4f5-4006-afee-4fc9d5f3a658
relation.isAuthorOfPublication	0ef9135c-b7c9-48f1-8f06-55c025236916
relation.isAuthorOfPublication	84d13059-7f4b-4cb5-ac65-0e07a77271f0
relation.isAuthorOfPublication	86e306a5-99a1-4c43-8faa-720f0a9f0a34
relation.isAuthorOfPublication.latestForDiscovery	6a6967e9-a4f5-4006-afee-4fc9d5f3a658

Files

Original bundle

Name: Expósito_R.R._MarDRe_efficient_MapReduce-based_removal_2017.pdf
Size: 92.37 KB
Format: Adobe Portable Document Format

Collections

Investigación (FIC)