    • OpenCNN: A Winograd Minimal Filtering Algorithm Implementation in CUDA

      López Castro, Roberto; Andrade, Diego; Fraguela, Basilio B. (MDPI, 2021)
      [Abstract] Improving the performance of the convolution operation has become a key target for High Performance Computing (HPC) developers due to its prevalence in deep learning applied mainly to video processing. The ...
    • STuning-DL: Model-Driven Autotuning of Sparse GPU Kernels for Deep Learning 

      López Castro, Roberto; Andrade, Diego; Fraguela, Basilio B. (Institute of Electrical and Electronics Engineers, 2024-05)
      [Abstract] The relentless growth of modern Machine Learning models has spurred the adoption of sparsification techniques to simplify their architectures and reduce the computational demands. Network pruning has demonstrated ...