Probing the Efficacy of Hardware-Aware Weight Pruning to Optimize the SpMM routine on Ampere GPUs
(Institute of Electrical and Electronics Engineers, 2022)
[Abstract] The Deep Learning (DL) community found in pruning techniques a good way to reduce the models' resource and energy consumption. These techniques lead to smaller sparse models, but sparse computations in GPUs ...
The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism
(Springer, 2022)
[Abstract] Data-flow computing is a natural and convenient paradigm for expressing parallelism. This is particularly true for tools that automatically extract the data dependencies among the tasks while allowing to exploit ...
RGen: Data Generator for Benchmarking Big Data Workloads
(MDPI, 2021)
[Abstract] This paper presents RGen, a parallel data generator for benchmarking Big Data workloads, which integrates existing features and new functionalities in a standalone tool. The main functionalities developed in ...
Hardware Implementation of Statecharts for FPGA-based Control in Scientific Facilities
(Institute of Electrical and Electronics Engineers, 2020-01-16)
[Abstract] The problem of generating complex synchronization patterns using automated tools is addressed in this paper. This work was originally motivated by the need of fast and jitter free synchronization in scientific ...
Performance Optimization of a Parallel Error Correction Tool
(MDPI, 2021)
[Abstract] Due to the continuous development in the field of Next Generation Sequencing (NGS) technologies that have allowed researchers to take advantage of greater genetic samples in less time, it is a matter of relevance ...
An approach to support generic topologies in distributed PSO algorithms in Spark
(Armando De Giusti, Marcelo Naiouf, Franco Chichizola, Enzo Rucci, Laura De Giusti. Universidad Nacional de La Plata. Facultad de Informática, 2023)
[Abstract] Particle Swarm Optimization (PSO) is a popular population-based search algorithm that has been applied to all kinds of complex optimization problems. Although the performance of the algorithm strongly depends ...
PolyBench/Python: Benchmarking Python Environments With Polyhedral Optimizations
(Association for Computing Machinery, 2021-03)
[Abstract] Python has become one of the most used and taught languages nowadays. Its expressiveness, cross-compatibility and ease of use have made it popular in areas as diverse as finance, bioinformatics or machine ...
Clupiter: a Raspberry Pi mini-supercomputer for educational purposes
(Institute of Electrical and Electronics Engineers, 2024)
[Abstract] The main objective of this work is to bring supercomputing and parallel processing closer to non-specialized audiences by building a Raspberry Pi cluster, called Clupiter, which emulates the operation of a ...
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
(Association for Computing Machinery, 2023-11)
[Abstract] The increasing success and scaling of Deep Learning models demands higher computational efficiency and power. Sparsification can lead to both smaller models as well as higher compute efficiency, and accelerated ...
Enabling Hardware Affinity in JVM-Based Applications: A Case Study for Big Data
(Springer, 2020)
[Abstract] Java has been the backbone of Big Data processing for more than a decade due to its interesting features such as object orientation, cross-platform portability and good programming productivity. In fact, most ...