    • Large-scale genome-wide association studies on a GPU cluster using a CUDA-accelerated PGAS programming model 

      González-Domínguez, Jorge; Kässens, Jan Christian; Wienbrandt, Lars; Schmidt, Bertil (Sage Publications Ltd., 2015)
      [Abstract] Detecting epistasis, such as 2-SNP interactions, in genome-wide association studies (GWAS) is an important but time-consuming operation. Consequently, GPUs have already been used to accelerate these studies, ...
    • Local Rollback for Resilient MPI Applications With Application-Level Checkpointing and Message Logging 

      Losada, Nuria; Bosilca, George; Bouteiller, Aurelien; González, Patricia; Martín, María J. (Elsevier BV * North-Holland, 2019-02)
      [Abstract] The resilience approach generally used in high-performance computing (HPC) relies on coordinated checkpoint/restart, a global rollback of all the processes that are running the application. However, in many ...
    • Locality-Aware Automatic Parallelization for GPGPU with OpenHMPP Directives 

      Andión, José M.; Arenaz Silva, Manuel; Bodin, François; Rodríguez, Gabriel; Touriño, Juan (Springer New York LLC, 2016-06)
      [Abstract] The use of GPUs for general purpose computation has increased dramatically in the past years due to the rising demands of computing power and their tremendous computing capacity at low cost. Hence, new programming ...
    • Low‐latency Java communication devices on RDMA‐enabled networks 

      Expósito, Roberto R.; López Taboada, Guillermo; Ramos Garea, Sabela; Touriño, Juan; Doallo, Ramón (John Wiley & Sons Ltd., 2015)
      [Abstract] Providing high‐performance inter‐node communication is a key capability for running high performance computing applications efficiently on parallel architectures. In fact, current systems deployments are aggregating ...
    • MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud 

      Expósito, Roberto R.; Veiga, Jorge; González-Domínguez, Jorge; Touriño, Juan (Oxford University Press, 2017)
      [Abstract] This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted ...
    • Mobile Robot Positioning with 433-MHz Wireless Motes with Varying Transmission Powers and a Particle Filter 

      Canedo-Rodríguez, Adrián; Rodríguez, José Manuel; Álvarez-Santos, Víctor; Iglesias, Roberto; Regueiro, Carlos V. (Multidisciplinary Digital Publishing Institute, 2015)
      [Abstract] In wireless positioning systems, the transmitter’s power is usually fixed. In this paper, we explore the use of varying transmission powers to increase the performance of a wireless localization system. To this extent, we ...
    • ModelTest-NG: A New and Scalable Tool for the Selection of DNA and Protein Evolutionary Models 

      Darriba, Diego; Posada, David; Kozlov, Alexey M.; Stamatakis, Alexandros; Morel, Benoit; Flouri, Tomas (Oxford University Press, 2019-08-21)
      [Abstract] ModelTest-NG is a reimplementation from scratch of jModelTest and ProtTest, two popular tools for selecting the best-fit nucleotide and amino acid substitution models, respectively. ModelTest-NG is one to two ...
    • MPI and UPC broadcast, scatter and gather algorithms in Xeon Phi 

      Mallón, Damián A.; López Taboada, Guillermo; Koesterke, Lars (John Wiley & Sons Ltd., 2016-05-06)
      [Abstract] Accelerators have revolutionised the high performance computing (HPC) community. Despite their advantages, their very specific programming models and limited communication capabilities have kept them in a ...
    • MPI-dot2dot: A Parallel Tool to Find DNA Tandem Repeats on Multicore Clusters 

      González-Domínguez, Jorge; Martín Martínez, José Manuel; Expósito, Roberto R. (Springer, 2022)
      [Abstract] Tandem Repeats (TRs) are segments that occur several times in a DNA sequence, and each copy is adjacent to another. In the last few years, TRs have gained significant attention as they are thought to be related ...
    • MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems 

      González-Domínguez, Jorge; Liu, Yongchao; Touriño, Juan; Schmidt, Bertil (Oxford University Press, 2016)
      [Abstract] MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input ...
    • Multimethod optimization in the cloud: A case‐study in systems biology modelling 

      González, Patricia; Penas, David R.; Pardo, Xoán C.; Banga, Julio R.; Doallo, Ramón (Wiley, 2018-06-25)
      [Abstract] Optimization problems appear in many different applications in science and engineering. A large number of different algorithms have been proposed for solving them; however, there is no unique general optimization ...
    • Multithreaded and Spark parallelization of feature selection filters 

      Eiras-Franco, Carlos; Bolón-Canedo, Verónica; Ramos Garea, Sabela; González-Domínguez, Jorge; Alonso-Betanzos, Amparo; Touriño, Juan (2016)
      [Abstract] Vast amounts of data are generated every day, constituting a volume that is challenging to analyze. Techniques such as feature selection are advisable when tackling large datasets. Among the tools that provide ...
    • Non-IID data and Continual Learning processes in Federated Learning: A long road ahead 

      Criado, Marcos F.; Casado, Fernando E.; Iglesias Rodríguez, Roberto; Regueiro, Carlos V.; Barro, Senén (Elsevier, 2022)
      [Abstract] Federated Learning is a novel framework that allows multiple devices or institutions to train a machine learning model collaboratively while keeping their data private. This decentralized approach is prone ...
    • Nonblocking collectives for scalable Java communications 

      Ramos Garea, Sabela; López Taboada, Guillermo; Expósito, Roberto R.; Touriño, Juan (John Wiley & Sons Ltd., 2015-04-22)
      [Abstract] This paper presents a Java implementation of the recently published MPI 3.0 nonblocking message passing collectives in order to analyze and assess the feasibility of taking advantage of these operations in shared ...
    • Novel parallelization of simulated annealing and Hooke & Jeeves search algorithms for multicore systems with application to complex fisheries stock assessment models 

      Vázquez Pardo, Sergio; Martín, María J.; Fraguela, Basilio B.; Gómez, Andrés; Rodríguez, Aurelio; Elvarsson, Bjarki Þór (Elsevier Ltd, 2016-11)
      [Abstract] Estimating parameters of a statistical fisheries assessment model typically involves a comparison of disparate datasets to a forward simulation model through a likelihood function. In all but trivial cases the ...
    • Numerical Simulation of Pollutant Transport in a Shallow-Water System on the Cell Heterogeneous Processor 

      González, Carlos H.; Fraguela, Basilio B.; Andrade, Diego; García Rodríguez, José Antonio; Castro, M.J. (Springer, 2013)
      [Abstract] This paper presents an implementation, optimized for the Cell processor, of a finite volume numerical scheme for 2D shallow-water systems with pollutant transport. A description of the special architecture and ...
    • On processing extreme data 

      Petcu, Dana; Iuhasz, Gabriel; Pop, Daniel; Talia, Domenico; Carretero, Jesús; Prodan, Radu; Fahringer, Thomas; Grasso, Ivan; Doallo, Ramón; Martín, María J.; Fraguela, Basilio B.; Trobec, Roman; Depolli, Matjaz; Almeida Rodriguez, Francisco; Sande, Francisco de; Da Costa, Georges; Pierson, Jean-Marc; Anastasiadis, Stergios; Bartzokas, Aristides; Lolis, Christos; Gonçalves, Pedro; Brito, Fabrice; Brown, Nick (Universitatea de Vest din Timisoara, West University of Timisoara, 2016)
      [Abstract] Extreme Data is an incarnation of the Big Data concept distinguished by the massive amounts of data that must be queried, communicated and analyzed in near real-time by using a very large number of memory or storage ...
    • OpenCNN: A Winograd Minimal Filtering Algorithm Implementation in CUDA 

      López Castro, Roberto; Andrade, Diego; Fraguela, Basilio B. (MDPI, 2021)
      [Abstract] Improving the performance of the convolution operation has become a key target for High Performance Computing (HPC) developers due to its prevalence in deep learning applied mainly to video processing. The ...
    • Optimization of Real-World MapReduce Applications With Flame-MR: Practical Use Cases 

      Veiga, Jorge; Expósito, Roberto R.; Raffin, Bruno; Touriño, Juan (Institute of Electrical and Electronics Engineers, 2018-11-12)
      [Abstract] Apache Hadoop is a widely used MapReduce framework for storing and processing large amounts of data. However, it presents some performance issues that hinder its utilization in many practical use cases. Although ...
    • Optimizing Coherence Traffic in Manycore Processors Using Closed-Form Caching/Home Agent Mappings 

      Kommrusch, Steve; Horro, Marcos; Pouchet, Louis-Noël; Rodríguez, Gabriel; Touriño, Juan (Institute of Electrical and Electronics Engineers, 2021-02-09)
      [Abstract] Manycore processors feature a high number of general-purpose cores designed to work in a multithreaded fashion. Recent manycore processors are kept coherent using scalable distributed directories. A paramount ...