Recent submissions

Showing items 6-10 of 55

VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores

López Castro, Roberto; Ivanov, Andrei; Andrade, Diego; Ben-Nun, Tal; Fraguela, Basilio B.; Hoefler, Torsten (Association for Computing Machinery, 2023-11)

[Abstract]: The increasing success and scaling of Deep Learning models demands higher computational efficiency and power. Sparsification can lead to both smaller models as well as higher compute efficiency, and accelerated ...

Clupiter: a Raspberry Pi mini-supercomputer for educational purposes

Rodríguez-Iglesias, Alonso; Martín, María J.; Touriño, Juan (Institute of Electrical and Electronics Engineers, 2024)

[Abstract]: The main objective of this work is to bring supercomputing and parallel processing closer to non-specialized audiences by building a Raspberry Pi cluster, called Clupiter, which emulates the operation of a ...

Guiding the Optimization of Parallel Codes on Multicores Using an Analytical Cache Model

Andrade, Diego; Fraguela, Basilio B.; Doallo, Ramón (2018)

[Abstract]: Cache performance is particularly hard to predict in modern multicore processors as several threads can be concurrently in execution, and private cache levels are combined with shared ones. This paper presents ...

Probing the Efficacy of Hardware-Aware Weight Pruning to Optimize the SpMM routine on Ampere GPUs

López Castro, Roberto; Andrade, Diego; Fraguela, Basilio B. (Institute of Electrical and Electronics Engineers, 2022)

[Abstract]: The Deep Learning (DL) community found in pruning techniques a good way to reduce the models' resource and energy consumption. These techniques lead to smaller sparse models, but sparse computations in GPUs ...

In-Transit Molecular Dynamics Analysis with Apache Flink

Zamúz, Henrique C.; Raffin, Bruno; Mures, Omar A.; Padrón, Emilio J. (Association for Computing Machinery (ACM), 2018-11)

[Abstract]: In this paper, an on-line parallel analytics framework is proposed to process and store in transit all the data being generated by a Molecular Dynamics (MD) simulation run using staging nodes in the same cluster ...

GI-GAC - Congresos, conferencias, etc.: Recent submissions
