Browse GI-GAC - Articles by author "González-Domínguez, Jorge"

A 2D algorithm with asymmetric workload for the UPC conjugate gradient method

González-Domínguez, Jorge; Marques, Osni A.; Martín, María J.; Touriño, Juan (Springer New York LLC, 2014)

[Abstract] This paper examines four different strategies, each one with its own data distribution, for implementing the parallel conjugate gradient (CG) method and how they impact communication and overall performance. ...

A SIMD Algorithm for the Detection of Epistatic Interactions of Any Order

Ponte-Fernández, Christian; González-Domínguez, Jorge; Martín, María J. (Elsevier, 2022)

[Abstract] Epistasis is a phenomenon in which a phenotype outcome is determined by the interaction of genetic variation at two or more loci and it cannot be attributed to the additive combination of effects corresponding ...

Accelerating binary biclustering on platforms with CUDA-enabled GPUs

González-Domínguez, Jorge; Expósito, Roberto R. (Elsevier Ltd, 2018)

[Abstract] Data mining is nowadays essential in many scientific fields to extract valuable information from large input datasets and transform it into an understandable structure. For instance, biclustering techniques are ...

Analysis of I/O Performance on an Amazon EC2 Cluster Compute and High I/O Platform

Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; González-Domínguez, Jorge; Touriño, Juan; Doallo, Ramón (Springer Netherlands, 2013-12)

[Abstract] Cloud computing is currently being explored by the scientific community to assess its suitability for High Performance Computing (HPC) environments. In this novel paradigm, compute and storage resources, as well ...

Automatic mapping of parallel applications on multicore architectures using the Servet benchmark suite

González-Domínguez, Jorge; Taboada, Guillermo L.; Fraguela, Basilio B.; Martín, María J.; Touriño, Juan (Pergamon Press, 2012-03)

[Abstract] Servet is a suite of benchmarks focused on detecting a set of parameters with high influence on the overall performance of multicore systems. These parameters can be used for autotuning codes to increase their ...

BigDEC: A multi-algorithm Big Data tool based on the k-mer spectrum method for scalable short-read error correction

Expósito, Roberto R.; González-Domínguez, Jorge (Elsevier, 2024-05)

[Abstract] Despite the significant improvements in both throughput and cost provided by modern Next-Generation Sequencing (NGS) platforms, sequencing errors in NGS datasets can still degrade the quality of downstream ...

bioScience: A new python science library for high-performance computing bioinformatics analytics

López-Fernández, Aurelio; Gómez-Vela, Francisco A.; González-Domínguez, Jorge; Bidare-Divakarachari, Parameshachari (Elsevier Ltd, 2024)

[Abstract] BioScience is an advanced Python library designed to satisfy the growing data analysis needs in the field of bioinformatics by leveraging High-Performance Computing (HPC). This library encompasses a vast multitude ...

CUDA acceleration of MI-based feature selection methods

Beceiro, Bieito; González-Domínguez, Jorge; Morán-Fernández, Laura; Bolón-Canedo, Verónica; Touriño, Juan (Elsevier, 2024-08)

[Abstract] Feature selection algorithms are necessary nowadays for machine learning as they are capable of removing irrelevant and redundant information to reduce the dimensionality of the data and improve the quality of ...

CUDA-JMI: Acceleration of feature selection on heterogeneous systems

González-Domínguez, Jorge; Expósito, Roberto R.; Bolón-Canedo, Verónica (Elsevier, 2020-01)

[Abstract] Feature selection is a crucial step nowadays in machine learning and data analytics to remove irrelevant and redundant characteristics and thus to provide fast and reliable analyses. Many research works have ...

Evaluation of Existing Methods for High-Order Epistasis Detection

Ponte-Fernández, Christian; González-Domínguez, Jorge; Carvajal-Rodriguez, Antonio; Martín, María J. (Institute of Electrical and Electronics Engineers, 2020-10-15)

[Abstract] Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the ...

Fast search of third-order epistatic interactions on CPU and GPU clusters

Ponte-Fernández, Christian; González-Domínguez, Jorge; Martín, María J. (Sage Publications Ltd., 2019-05-27)

[Abstract] Genome-Wide Association Studies (GWASs), analyses that try to find a link between a given phenotype (such as a disease) and genetic markers, have been growing in popularity in the recent years. Relations between ...

Fiuncho: a program for any-order epistasis detection in CPU clusters

Ponte-Fernández, Christian; González-Domínguez, Jorge; Martín, María J. (Springer, 2022)

[Abstract] Epistasis can be defined as the statistical interaction of genes during the expression of a phenotype. It is believed that it plays a fundamental role in gene expression, as individual genetic variants have ...

GPU-accelerated exhaustive search for third-order epistatic interactions in case–control studies

González-Domínguez, Jorge; Schmidt, Bertil (Elsevier Ltd, 2015)

[Abstract] Interest in discovering combinations of genetic markers from case–control studies, such as Genome Wide Association Studies (GWAS), that are strongly associated to diseases has increased in recent years. Detecting ...

High-speed exhaustive 3-locus interaction epistasis analysis on FPGAs

Kässens, Jan Christian; Wienbrandt, Lars; González-Domínguez, Jorge; Schmidt, Bertil; Schimmler, Manfred (Elsevier B.V., 2015-07)

[Abstract] Epistasis, the interaction between genes, has become a major topic in molecular and quantitative genetics. It is believed that these interactions play a significant role in genetic variations causing complex ...

HSRA: Hadoop-based spliced read aligner for RNA sequencing data

Expósito, Roberto R.; González-Domínguez, Jorge; Touriño, Juan (Public Library of Science, 2018-07-31)

[Abstract] Nowadays, the analysis of transcriptome sequencing (RNA-seq) data has become the standard method for quantifying the levels of gene expression. In RNA-seq experiments, the mapping of short reads to a reference ...

Large-scale genome-wide association studies on a GPU cluster using a CUDA-accelerated PGAS programming model

González-Domínguez, Jorge; Kässens, Jan Christian; Wienbrandt, Lars; Schmidt, Bertil (Sage Publications Ltd., 2015)

[Abstract] Detecting epistasis, such as 2-SNP interactions, in genome-wide association studies (GWAS) is an important but time consuming operation. Consequently, GPUs have already been used to accelerate these studies, ...

MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud

Expósito, Roberto R.; Veiga, Jorge; González-Domínguez, Jorge; Touriño, Juan (Oxford University Press, 2017)

[Abstract] This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted ...

MPI-dot2dot: A Parallel Tool to Find DNA Tandem Repeats on Multicore Clusters

González-Domínguez, Jorge; Martín Martínez, José Manuel; Expósito, Roberto R. (Springer, 2022)

[Abstract] Tandem Repeats (TRs) are segments that occur several times in a DNA sequence, and each copy is adjacent to another. In the last few years, TRs have gained significant attention as they are thought to be related ...

MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems

González-Domínguez, Jorge; Liu, Yongchao; Touriño, Juan; Schmidt, Bertil (Oxford University Press, 2016)

[Abstract] MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input ...

Multithreaded and Spark parallelization of feature selection filters

Eiras-Franco, Carlos; Bolón-Canedo, Verónica; Ramos Garea, Sabela; González-Domínguez, Jorge; Alonso-Betanzos, Amparo; Touriño, Juan (2016)

[Abstract] Vast amounts of data are generated every day, constituting a volume that is challenging to analyze. Techniques such as feature selection are advisable when tackling large datasets. Among the tools that provide ...