Showing items 1-10 of 16
Communication avoiding and overlapping for numerical linear algebra
(IEEE Computer Society, 2013-02-25)
[Abstract] To efficiently scale dense linear algebra problems to future exascale systems, communication cost must be avoided or overlapped. Communication-avoiding 2.5D algorithms improve scalability by reducing inter-processor ...
Parallel Pairwise Epistasis Detection on Heterogeneous Computing Architectures
(Institute of Electrical and Electronics Engineers, 2016-08)
[Abstract] Development of new methods to detect pairwise epistasis, such as SNP-SNP interactions, in Genome-Wide Association Studies is an important task in bioinformatics as they can help to explain genetic influences on ...
MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud
(Oxford University Press, 2017)
[Abstract] This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted ...
Acceleration of a Feature Selection Algorithm Using High Performance Computing
(MDPI AG, 2020-09-01)
[Abstract] Feature selection is a subfield of data analysis that focuses on reducing the dimensionality of datasets, so that subsequent analyses over them can be performed in affordable execution times while keeping the same ...
A 2D algorithm with asymmetric workload for the UPC conjugate gradient method
(Springer New York LLC, 2014)
[Abstract] This paper examines four different strategies, each one with its own data distribution, for implementing the parallel conjugate gradient (CG) method and how they impact communication and overall performance. ...
HSRA: Hadoop-based spliced read aligner for RNA sequencing data
(Public Library of Science, 2018-07-31)
[Abstract] Nowadays, the analysis of transcriptome sequencing (RNA-seq) data has become the standard method for quantifying the levels of gene expression. In RNA-seq experiments, the mapping of short reads to a reference ...
MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems
(Oxford University Press, 2016)
[Abstract] MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input ...
SMusket: Spark-based DNA error correction on distributed-memory systems
(Elsevier B.V., 2020)
[Abstract] Next-Generation Sequencing (NGS) technologies have revolutionized genomics research over the last decade, bringing new opportunities for scientists to perform groundbreaking biological studies. Error correction ...
Parallel feature selection for distributed-memory clusters
(2019)
[Abstract] Feature selection is nowadays an extremely important data mining stage in the field of machine learning due to the appearance of problems of high dimensionality. In the literature there are numerous feature ...
Multithreaded and Spark parallelization of feature selection filters
(2016)
[Abstract] Vast amounts of data are generated every day, constituting a volume that is challenging to analyze. Techniques such as feature selection are advisable when tackling large datasets. Among the tools that provide ...