    • A 2D algorithm with asymmetric workload for the UPC conjugate gradient method

      González-Domínguez, Jorge; Marques, Osni A.; Martín, María J.; Touriño, Juan (Springer New York LLC, 2014)
      [Abstract] This paper examines four different strategies, each one with its own data distribution, for implementing the parallel conjugate gradient (CG) method and how they impact communication and overall performance. ...
    • A Software Cache Autotuning Strategy for Dataflow Computing with UPC++ DepSpawn 

      Fraguela, Basilio B.; Andrade, Diego (Wiley, 2021)
      [Abstract] Dataflow computing allows computations to start as soon as all their dependencies are satisfied. This is particularly useful in applications with irregular or complex patterns of dependencies which would otherwise ...
    • High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn 

      Fraguela, Basilio B.; Andrade, Diego (Springer, 2021)
      [Abstract] Dataflow computing is a very attractive paradigm for high-performance computing, given its ability to trigger computations as soon as their inputs are available. UPC++ DepSpawn is a novel task-based library ...
    • Large-scale genome-wide association studies on a GPU cluster using a CUDA-accelerated PGAS programming model 

      González-Domínguez, Jorge; Kässens, Jan Christian; Wienbrandt, Lars; Schmidt, Bertil (Sage Publications Ltd., 2015)
      [Abstract] Detecting epistasis, such as 2-SNP interactions, in genome-wide association studies (GWAS) is an important but time-consuming operation. Consequently, GPUs have already been used to accelerate these studies, ...
    • Parallel and Scalable Short-Read Alignment on Multi-Core Clusters Using UPC++ 

      Liu, Yongchao; Schmidt, Bertil; González-Domínguez, Jorge (Johannes Gutenberg University Mainz, 2016)
      [Abstract] The growth of next-generation sequencing (NGS) datasets poses a challenge to the alignment of reads to reference genomes in terms of alignment quality and execution speed. Some available aligners have been shown ...
    • Parallel Brownian dynamics simulations with the message-passing and PGAS programming models 

      Teijeiro Barjas, Carlos; Sutmann, Godehard; Taboada, Guillermo L.; Touriño, Juan (Elsevier BV, 2013-04)
      [Abstract] The simulation of particle dynamics is among the most important mechanisms to study the behavior of molecules in a medium under specific conditions of temperature and density. Several models can be used to compute ...
    • Parallel simulation of Brownian dynamics on shared memory systems with OpenMP and Unified Parallel C 

      Teijeiro Barjas, Carlos; Sutmann, Godehard; Taboada, Guillermo L.; Touriño, Juan (Springer New York LLC, 2013-09)
      [Abstract] The simulation of particle dynamics is an essential method to analyze and predict the behavior of molecules in a given medium. This work presents the design and implementation of a parallel simulation of Brownian ...
    • parSRA: A framework for the parallel execution of short read aligners on compute clusters 

      González-Domínguez, Jorge; Hundt, Christian; Schmidt, Bertil (2018)
      [Abstract] The growth of next-generation sequencing datasets poses a challenge to the alignment of reads to reference genomes in terms of both accuracy and speed. In this work we present parSRA, a parallel framework ...
    • Performance Evaluation of Sparse Matrix Products in UPC 

      González-Domínguez, Jorge; García-López, Óscar; Taboada, Guillermo L.; Martín, María J.; Touriño, Juan (Springer New York LLC, 2013-04)
      [Abstract] Unified Parallel C (UPC) is a Partitioned Global Address Space (PGAS) language whose popularity has increased in recent years owing to its high programmability and reasonable performance through an efficient ...
    • Scalable PGAS collective operations in NUMA clusters 

      Mallón, Damián A.; Teijeiro Barjas, Carlos; González-Domínguez, Jorge; Taboada, Guillermo L.; Gómez, Andrés (Springer New York LLC, 2014-12)
      [Abstract] The increasing number of cores per processor is making manycore-based systems pervasive. This involves dealing with multiple levels of memory in non-uniform memory access (NUMA) systems and processor cores ...
    • UPCBLAS: a library for parallel matrix computations in Unified Parallel C 

      González-Domínguez, Jorge; Martín, María J.; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón; Mallón, Damián A.; Wibecan, Brian (John Wiley & Sons Ltd., 2012-09-25)
      [Abstract] The popularity of Partitioned Global Address Space (PGAS) languages has increased in recent years thanks to their high programmability and performance through an efficient exploitation of data locality, ...