Showing items 1-5 of 5

MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud

Expósito, Roberto R.; Veiga, Jorge; González-Domínguez, Jorge; Touriño, Juan (Oxford University Press, 2017)

[Abstract] This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted ...

HSRA: Hadoop-based spliced read aligner for RNA sequencing data

Expósito, Roberto R.; González-Domínguez, Jorge; Touriño, Juan (Public Library of Science, 2018-07-31)

[Abstract] Nowadays, the analysis of transcriptome sequencing (RNA-seq) data has become the standard method for quantifying the levels of gene expression. In RNA-seq experiments, the mapping of short reads to a reference ...

SMusket: Spark-based DNA error correction on distributed-memory systems

Expósito, Roberto R.; González-Domínguez, Jorge; Touriño, Juan (Elsevier B.V., 2020)

[Abstract] Next-Generation Sequencing (NGS) technologies have revolutionized genomics research over the last decade, bringing new opportunities for scientists to perform groundbreaking biological studies. Error correction ...

Analysis of I/O Performance on an Amazon EC2 Cluster Compute and High I/O Platform

Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; González-Domínguez, Jorge; Touriño, Juan; Doallo, Ramón (Springer Netherlands, 2013-12)

[Abstract] Cloud computing is currently being explored by the scientific community to assess its suitability for High Performance Computing (HPC) environments. In this novel paradigm, compute and storage resources, as well ...

The Servet 3.0 benchmark suite: characterization of network performance degradation

González-Domínguez, Jorge; Martín, María J.; Taboada, Guillermo L.; Expósito, Roberto R.; Touriño, Juan (Pergamon Press, 2013-11)

[Abstract] Servet is a suite of benchmarks focused on extracting a set of parameters with high influence on the overall performance of multicore clusters. These parameters can be used to optimize the performance of parallel ...