    • A 2D algorithm with asymmetric workload for the UPC conjugate gradient method

      González-Domínguez, Jorge; Marques, Osni A.; Martín, María J.; Touriño, Juan (Springer New York LLC, 2014)
      [Abstract] This paper examines four different strategies, each one with its own data distribution, for implementing the parallel conjugate gradient (CG) method and how they impact communication and overall performance. ...
    • MPI and UPC broadcast, scatter and gather algorithms in Xeon Phi 

      Mallón, Damián A.; Taboada, Guillermo L.; Koesterke, Lars (John Wiley & Sons Ltd., 2016-05-06)
      [Abstract] Accelerators have revolutionised the high performance computing (HPC) community. Despite their advantages, their very specific programming models and limited communication capabilities have kept them in a ...
    • Parallel Brownian dynamics simulations with the message-passing and PGAS programming models 

      Teijeiro Barjas, Carlos; Sutmann, Godehard; Taboada, Guillermo L.; Touriño, Juan (Elsevier BV, 2013-04)
      [Abstract] The simulation of particle dynamics is among the most important mechanisms to study the behavior of molecules in a medium under specific conditions of temperature and density. Several models can be used to compute ...
    • Parallel simulation of Brownian dynamics on shared memory systems with OpenMP and Unified Parallel C 

      Teijeiro Barjas, Carlos; Sutmann, Godehard; Taboada, Guillermo L.; Touriño, Juan (Springer New York LLC, 2013-09)
      [Abstract] The simulation of particle dynamics is an essential method to analyze and predict the behavior of molecules in a given medium. This work presents the design and implementation of a parallel simulation of Brownian ...
    • Performance Evaluation of Sparse Matrix Products in UPC 

      González-Domínguez, Jorge; García-López, Óscar; Taboada, Guillermo L.; Martín, María J.; Touriño, Juan (Springer New York LLC, 2013-04)
      [Abstract] Unified Parallel C (UPC) is a Partitioned Global Address Space (PGAS) language whose popularity has increased in recent years owing to its high programmability and reasonable performance through an efficient ...
    • Scalable PGAS collective operations in NUMA clusters 

      Mallón, Damián A.; Teijeiro Barjas, Carlos; González-Domínguez, Jorge; Taboada, Guillermo L.; Gómez, Andrés (Springer New York LLC, 2014-12)
      [Abstract] The increasing number of cores per processor is making manycore-based systems pervasive. This involves dealing with multiple levels of memory in non-uniform memory access (NUMA) systems and processor cores ...
    • UPCBLAS: a library for parallel matrix computations in Unified Parallel C 

      González-Domínguez, Jorge; Martín, María J.; Taboada, Guillermo L.; Touriño, Juan; Doallo, Ramón; Mallón, Damián A.; Wibecan, Brian (John Wiley & Sons Ltd., 2012-09-25)
      [Abstract] The popularity of Partitioned Global Address Space (PGAS) languages has increased in recent years thanks to their high programmability and performance through an efficient exploitation of data locality, ...