Showing items 11-18 of 18
An automatic optimizer for heterogeneous devices
(Elsevier, 2020-05)
[Abstract]: Codes written in a naive way seldom effectively exploit the computing resources, while writing optimized codes is usually a complex task that requires certain levels of expertise. This problem is further increased ...
Heterogeneous distributed computing based on high-level abstractions
(2018)
[Abstract]: The rise of heterogeneous systems has given place to great challenges for users as they involve new concepts, restrictions, and frameworks. Their exploitation is further complicated in the context of distributed ...
Guiding the Optimization of Parallel Codes on Multicores Using an Analytical Cache Model
(2018)
[Abstract]: Cache performance is particularly hard to predict in modern multicore processors as several threads can be concurrently in execution, and private cache levels are combined with shared ones. This paper presents ...
The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism
(Springer, 2022)
[Abstract]: Data-flow computing is a natural and convenient paradigm for expressing parallelism. This is particularly true for tools that automatically extract the data dependencies among the tasks while allowing to exploit ...
Easy Dataflow Programming in Clusters with UPC++ DepSpawn
(Institute of Electrical and Electronics Engineers, 2019-06-01)
[Abstract]: The Partitioned Global Address Space (PGAS) programming model is one of the most relevant proposals to improve the ability of developers to exploit distributed memory systems. However, despite its important ...
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
(Association for Computing Machinery, 2023-11)
[Abstract]: The increasing success and scaling of Deep Learning models demands higher computational efficiency and power. Sparsification can lead to both smaller models as well as higher compute efficiency, and accelerated ...
A Fast Solver for Large Tridiagonal Systems on Multi-Core Processors (Lass Library)
(Institute of Electrical and Electronics Engineers, 2019)
[Abstract]: Many problems of industrial and scientific interest require the solving of tridiagonal linear systems. This paper presents several implementations for the parallel solving of large tridiagonal systems on ...
Numerical Simulation of Pollutant Transport in a Shallow-Water System on the Cell Heterogeneous Processor
(Springer, 2013)
[Abstract]: This paper presents an implementation, optimized for the Cell processor, of a finite volume numerical scheme for 2D shallow-water systems with pollutant transport. A description of the special architecture and ...