    • A multi-GPU shallow-water simulation with transport of contaminants

      Viñas Buceta, Moisés; Lobeiras Blanco, Jacobo; Fraguela, Basilio B.; Arenaz Silva, Manuel; Amor, Margarita; García Rodríguez, José Antonio; Castro, M.J.; Doallo, Ramón (Wiley, 2012)
      [Abstract] This work presents cost-effective multi-graphics processing unit (GPU) parallel implementations of a finite-volume numerical scheme for solving pollutant transport problems in bidimensional domains. The fluid ...
    • Accelerating binary biclustering on platforms with CUDA-enabled GPUs 

      González-Domínguez, Jorge; Expósito, Roberto R. (Elsevier Ltd, 2018)
      [Abstract] Data mining is nowadays essential in many scientific fields to extract valuable information from large input datasets and transform it into an understandable structure. For instance, biclustering techniques are ...
    • BPLG–BMCS: GPU-sorting algorithm using a tuning skeleton library 

      Pérez Diéguez, Adrián; Amor, Margarita; Doallo, Ramón (Springer New York LLC, 2017)
      [Abstract] In this work, we present an efficient and portable sorting operator for GPUs. Specifically, we propose an algorithmic variant of the bitonic merge sort which reduces the number of processing stages and internal ...
    • CUDA acceleration of MI-based feature selection methods 

      Beceiro, Bieito; González-Domínguez, Jorge; Morán-Fernández, Laura; Bolón-Canedo, Verónica; Touriño, Juan (Elsevier, 2024-08)
      [Abstract] Feature selection algorithms are necessary nowadays for machine learning as they are capable of removing irrelevant and redundant information to reduce the dimensionality of the data and improve the quality of ...
    • CUDA-JMI: Acceleration of feature selection on heterogeneous systems 

      González-Domínguez, Jorge; Expósito, Roberto R.; Bolón-Canedo, Verónica (Elsevier, 2020-01)
      [Abstract] Feature selection is a crucial step nowadays in machine learning and data analytics to remove irrelevant and redundant characteristics and thus to provide fast and reliable analyses. Many research works have ...
    • Efficient high-precision integer multiplication on the GPU 

      Pérez Diéguez, Adrián; Amor, Margarita; Doallo, Ramón; Nukada, Akira; Matsuoka, Satoshi (SAGE Journals, 2022-03)
      [Abstract] The multiplication of large integers, which has many applications in computer science, is an operation that can be expressed as a polynomial multiplication followed by a carry normalization. This work develops ...
    • General‐purpose computation on GPUs for high performance cloud computing 

      Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; Touriño, Juan; Doallo, Ramón (John Wiley & Sons Ltd., 2013-08)
      [Abstract] Cloud computing is offering new approaches for High Performance Computing (HPC) as it provides dynamically scalable resources as a service over the Internet. In addition, General‐Purpose computation on Graphical ...
    • GPU-accelerated exhaustive search for third-order epistatic interactions in case–control studies 

      González-Domínguez, Jorge; Schmidt, Bertil (Elsevier Ltd, 2015)
      [Abstract] Interest in discovering combinations of genetic markers from case–control studies, such as Genome Wide Association Studies (GWAS), that are strongly associated to diseases has increased in recent years. Detecting ...
    • Large-scale genome-wide association studies on a GPU cluster using a CUDA-accelerated PGAS programming model 

      González-Domínguez, Jorge; Kässens, Jan Christian; Wienbrandt, Lars; Schmidt, Bertil (Sage Publications Ltd., 2015)
      [Abstract] Detecting epistasis, such as 2-SNP interactions, in genome-wide association studies (GWAS) is an important but time consuming operation. Consequently, GPUs have already been used to accelerate these studies, ...
    • OpenCNN: A Winograd Minimal Filtering Algorithm Implementation in CUDA 

      López Castro, Roberto; Andrade, Diego; Fraguela, Basilio B. (MDPI, 2021)
      [Abstract] Improving the performance of the convolution operation has become a key target for High Performance Computing (HPC) developers due to its prevalence in deep learning applied mainly to video processing. The ...
    • Speed and accuracy improvement of higher-order epistasis detection on CUDA-enabled GPUs 

      Jünger, Daniel; Hundt, Christian; González-Domínguez, Jorge; Schmidt, Bertil (Springer, 2017)
      [Abstract] The discovery of higher-order epistatic interactions is an important task in the field of genome wide association studies which allows for the identification of complex interaction patterns between multiple ...
    • STuning-DL: Model-Driven Autotuning of Sparse GPU Kernels for Deep Learning 

      López Castro, Roberto; Andrade, Diego; Fraguela, Basilio B. (Institute of Electrical and Electronics Engineers, 2024-05)
      [Abstract] The relentless growth of modern Machine Learning models has spurred the adoption of sparsification techniques to simplify their architectures and reduce the computational demands. Network pruning has demonstrated ...
    • Tree Partitioning Reduction: A New Parallel Partition Method for Solving Tridiagonal Systems 

      Pérez Diéguez, Adrián; Amor, Margarita; Doallo, Ramón (Association for Computing Machinery (ACM), 2019-08)