    • A Fast Solver for Large Tridiagonal Systems on Multi-Core Processors (Lass Library)

      Valero-Lara, Pedro; Andrade, Diego; Sirvent, Raül; Labarta, Jesús; Fraguela, Basilio B.; Doallo, Ramón (Institute of Electrical and Electronics Engineers, 2019)
      [Abstract] Many problems of industrial and scientific interest require the solving of tridiagonal linear systems. This paper presents several implementations for the parallel solving of large tridiagonal systems on ...
    • A Software Cache Autotuning Strategy for Dataflow Computing with UPC++ DepSpawn 

      Fraguela, Basilio B.; Andrade, Diego (Wiley, 2021)
      [Abstract] Dataflow computing allows to start computations as soon as all their dependencies are satisfied. This is particularly useful in applications with irregular or complex patterns of dependencies which would otherwise ...
    • An automatic optimizer for heterogeneous devices 

      Fernández-Fabeiro, Jorge; Andrade, Diego; Fraguela, Basilio B.; Doallo, Ramón (Elsevier, 2020-05)
      [Abstract] Codes written in a naive way seldom effectively exploit the computing resources, while writing optimized codes is usually a complex task that requires certain levels of expertise. This problem is further increased ...
    • Automated and accurate cache behavior analysis for codes with irregular access patterns 

      Andrade, Diego; Arenaz Silva, Manuel; Fraguela, Basilio B.; Touriño, Juan; Doallo, Ramón (John Wiley & Sons Ltd., 2007-04-03)
      [Abstract] The memory hierarchy plays an essential role in the performance of current computers, so good analysis tools that help in predicting and understanding its behavior are required. Analytical modeling is the ideal ...
    • Developing adaptive multi-device applications with the Heterogeneous Programming Library 

      Viñas Buceta, Moisés; Bozkus, Zeki; Fraguela, Basilio B.; Andrade, Diego; Doallo, Ramón (Springer, 2015)
      [Abstract] The usage of heterogeneous devices presents two main problems. One is their complex programming, a problem that grows when multiple devices are used. The second issue is that even if the codes for these devices ...
    • Easy Dataflow Programming in Clusters with UPC++ DepSpawn 

      Fraguela, Basilio B.; Andrade, Diego (Institute of Electrical and Electronics Engineers, 2019-06-01)
      [Abstract] The Partitioned Global Address Space (PGAS) programming model is one of the most relevant proposals to improve the ability of developers to exploit distributed memory systems. However, despite its important ...
    • Facilitating the development of stencil applications using the Heterogeneous Programming Library 

      Viñas Buceta, Moisés; Fraguela, Basilio B.; Andrade, Diego; Doallo, Ramón (2017)
      [Abstract] Stencil computations are very common in scientific codes. Heterogeneous systems achieve good results solving these problems, but their programming is complex because of the ghost regions required in multi-device ...
    • Heterogeneous distributed computing based on high-level abstractions 

      Viñas Buceta, Moisés; Fraguela, Basilio B.; Andrade, Diego; Doallo, Ramón (2018)
      [Abstract] The rise of heterogeneous systems has given place to great challenges for users as they involve new concepts, restrictions, and frameworks. Their exploitation is further complicated in the context of distributed ...
    • High Productivity Multi-device Exploitation with the Heterogeneous Programming Library 

      Viñas Buceta, Moisés; Fraguela, Basilio B.; Andrade, Diego; Doallo, Ramón (Elsevier, 2016)
      [Abstract] Heterogeneous devices require much more work from programmers than traditional CPUs, particularly when there are several of them, as each one has its own memory space. Multidevice applications require to distribute ...
    • High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn 

      Fraguela, Basilio B.; Andrade, Diego (Springer, 2021)
      [Abstract] Dataflow computing is a very attractive paradigm for high-performance computing, given its ability to trigger computations as soon as their inputs are available. UPC++ DepSpawn is a novel task-based library ...
    • Numerical Simulation of Pollutant Transport in a Shallow-Water System on the Cell Heterogeneous Processor 

      González, Carlos H.; Fraguela, Basilio B.; Andrade, Diego; García Rodríguez, José Antonio; Castro, M.J. (Springer, 2013)
      [Abstract] This paper presents an implementation, optimized for the Cell processor, of a finite volume numerical scheme for 2D shallow-water systems with pollutant transport. A description of the special architecture and ...
    • OpenCNN: A Winograd Minimal Filtering Algorithm Implementation in CUDA 

      López Castro, Roberto; Andrade, Diego; Fraguela, Basilio B. (MDPI, 2021)
      [Abstract] Improving the performance of the convolution operation has become a key target for High Performance Computing (HPC) developers due to its prevalence in deep learning applied mainly to video processing. The ...
    • ScalaParBiBit: Scaling the Binary Biclustering in Distributed-Memory Systems 

      Fraguela, Basilio B.; Andrade, Diego; González-Domínguez, Jorge (SpringerLink, 2021-03-19)
      [Abstract] Biclustering is a data mining technique that allows us to find groups of rows and columns that are highly correlated in a 2D dataset. Although there exist several software applications to perform biclustering, ...