    • Guiding the Optimization of Parallel Codes on Multicores Using an Analytical Cache Model 

      Andrade, Diego; Fraguela, Basilio B.; Doallo, Ramón (2018)
      [Abstract]: Cache performance is particularly hard to predict in modern multicore processors as several threads can be concurrently in execution, and private cache levels are combined with shared ones. This paper presents ...
    • Heterogeneous distributed computing based on high-level abstractions 

      Viñas Buceta, Moisés; Fraguela, Basilio B.; Andrade, Diego; Doallo, Ramón (2018)
      [Abstract]: The rise of heterogeneous systems has given place to great challenges for users as they involve new concepts, restrictions, and frameworks. Their exploitation is further complicated in the context of distributed ...
    • High Productivity Multi-device Exploitation with the Heterogeneous Programming Library 

      Viñas Buceta, Moisés; Fraguela, Basilio B.; Andrade, Diego; Doallo, Ramón (Elsevier, 2016)
      [Abstract] Heterogeneous devices require much more work from programmers than traditional CPUs, particularly when there are several of them, as each one has its own memory space. Multidevice applications require to distribute ...
    • High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn 

      Fraguela, Basilio B.; Andrade, Diego (Springer, 2021)
      [Abstract]: Dataflow computing is a very attractive paradigm for high-performance computing, given its ability to trigger computations as soon as their inputs are available. UPC++ DepSpawn is a novel task-based library ...
    • Modelado analítico del comportamiento de memorias caché 

      Fraguela, Basilio B. (1999)
      [Abstract] The main bottleneck limiting the computation rates that current systems can achieve lies in the growing speed gap between the processor and the memories. To respond ...
    • Novel parallelization of simulated annealing and Hooke & Jeeves search algorithms for multicore systems with application to complex fisheries stock assessment models 

      Vázquez Pardo, Sergio; Martín, María J.; Fraguela, Basilio B.; Gómez, Andrés; Rodríguez, Aurelio; Elvarsson, Bjarki Þór (Elsevier Ltd, 2016-11)
      [Abstract] Estimating parameters of a statistical fisheries assessment model typically involves a comparison of disparate datasets to a forward simulation model through a likelihood function. In all but trivial cases the ...
    • Numerical Simulation of Pollutant Transport in a Shallow-Water System on the Cell Heterogeneous Processor 

      González, Carlos H.; Fraguela, Basilio B.; Andrade, Diego; García Rodríguez, José Antonio; Castro, M.J. (Springer, 2013)
      [Abstract] This paper presents an implementation, optimized for the Cell processor, of a finite volume numerical scheme for 2D shallow-water systems with pollutant transport. A description of the special architecture and ...
    • On processing extreme data 

      Petcu, Dana; Iuhasz, Gabriel; Pop, Daniel; Talia, Domenico; Carretero, Jesús; Prodan, Radu; Fahringer, Thomas; Grasso, Ivan; Doallo, Ramón; Martín, María J.; Fraguela, Basilio B.; Trobec, Roman; Depolli, Matjaz; Almeida Rodriguez, Francisco; Sande, Francisco de; Da Costa, Georges; Pierson, Jean-Marc; Anastasiadis, Stergios; Bartzokas, Aristides; Lolis, Christos; Gonçalves, Pedro; Brito, Fabrice; Brown, Nick (Universitatea de Vest din Timisoara, West University of Timisoara, 2016)
      [Abstract] Extreme Data is an incarnation of Big Data concept distinguished by the massive amounts of data that must be queried, communicated and analyzed in near real-time by using a very large number of memory or storage ...
    • OpenCNN: A Winograd Minimal Filtering Algorithm Implementation in CUDA 

      López Castro, Roberto; Andrade, Diego; Fraguela, Basilio B. (MDPI, 2021)
      [Abstract] Improving the performance of the convolution operation has become a key target for High Performance Computing (HPC) developers due to its prevalence in deep learning applied mainly to video processing. The ...
    • Parallel Sparse Modified Gram-Schmidt QR Decomposition 

      Doallo, Ramón; Fraguela, Basilio B.; Touriño, Juan; Zapata, Emilio L. (Springer, 1996)
      [Abstract] We present a parallel computational method for the QR decomposition with column pivoting of a sparse matrix by means of Modified Gram-Schmidt orthogonalization. Nonzero elements of the matrix M to be decomposed ...
    • Parallelization of shallow water simulations on current multi-threaded systems 

      Lobeiras Blanco, Jacobo; Viñas Buceta, Moisés; Amor, Margarita; Fraguela, Basilio B.; Arenaz Silva, Manuel; García Rodríguez, José Antonio; Castro, M.J. (SAGE Journals, 2013-11)
      [Abstract]: In this work, several parallel implementations of a numerical model of pollutant transport on a shallow water system are presented. These parallel implementations are developed in two phases. First, the sequential ...
    • Performance Evaluation of MPI, UPC and OpenMP on Multicore Architectures 

      Mallón, Damián A.; Taboada, Guillermo L.; Teijeiro Barjas, Carlos; Touriño, Juan; Fraguela, Basilio B.; Gómez, Andrés; Doallo, Ramón; Mouriño, José C. (Springer, 2009)
      [Abstract] The current trend to multicore architectures underscores the need of parallelism. While new languages and alternatives for supporting more efficiently these systems are proposed, MPI faces this new challenge. ...
    • Performance Evaluation of Unified Parallel C Collective Communications 

      Taboada, Guillermo L.; Teijeiro Barjas, Carlos; Touriño, Juan; Fraguela, Basilio B.; Doallo, Ramón; Mouriño, José C.; Mallón, Damián A.; Gómez, Andrés (IEEE Computer Society, 2009-07-17)
      [Abstract] Unified Parallel C (UPC) is an extension of ANSI C designed for parallel programming. UPC collective primitives, which are part of the UPC standard, increase programming productivity while reducing the communication ...
    • Portable and efficient FFT and DCT algorithms with the Heterogeneous Butterfly Processing Library 

      Vázquez Pardo, Sergio; Amor, Margarita; Fraguela, Basilio B. (Elsevier, 2019-03)
      [Abstract]: The existence of a wide variety of computing devices with very different properties makes essential the development of software that is not only portable among them, but which also adapts to the properties of ...
    • Probing the Efficacy of Hardware-Aware Weight Pruning to Optimize the SpMM routine on Ampere GPUs 

      López Castro, Roberto; Andrade, Diego; Fraguela, Basilio B. (Institute of Electrical and Electronics Engineers, 2022)
      [Abstract]: The Deep Learning (DL) community found in pruning techniques a good way to reduce the models' resource and energy consumption. These techniques lead to smaller sparse models, but sparse computations in GPUs ...
    • ScalaParBiBit: Scaling the Binary Biclustering in Distributed-Memory Systems 

      Fraguela, Basilio B.; Andrade, Diego; González-Domínguez, Jorge (SpringerLink, 2021-03-19)
      [Abstract] Biclustering is a data mining technique that allows us to find groups of rows and columns that are highly correlated in a 2D dataset. Although there exist several software applications to perform biclustering, ...
    • Servet: A Benchmark Suite for Autotuning on Multicore Clusters 

      González-Domínguez, Jorge; Taboada, Guillermo L.; Fraguela, Basilio B.; Martín, María J.; Touriño, Juan (Institute of Electrical and Electronics Engineers, 2010-05-24)
      [Abstract] MapReduce is a powerful tool for processing large data sets used by many applications running in distributed environments. However, despite the increasing number of computationally intensive problems that require ...
    • The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism 

      Fraguela, Basilio B.; Andrade, Diego (Springer, 2022)
      [Abstract] Data-flow computing is a natural and convenient paradigm for expressing parallelism. This is particularly true for tools that automatically extract the data dependencies among the tasks while allowing to exploit ...
    • VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores 

      López Castro, Roberto; Ivanov, Andrei; Andrade, Diego; Ben-Nun, Tal; Fraguela, Basilio B.; Hoefler, Torsten (Association for Computing Machinery, 2023-11)
      [Abstract]: The increasing success and scaling of Deep Learning models demands higher computational efficiency and power. Sparsification can lead to both smaller models as well as higher compute efficiency, and accelerated ...