Showing items 1-7 of 7
A Software Cache Autotuning Strategy for Dataflow Computing with UPC++ DepSpawn
(Wiley, 2021)
[Abstract] Dataflow computing allows computations to start as soon as all their dependencies are satisfied. This is particularly useful in applications with irregular or complex patterns of dependencies which would otherwise ...
A Parallel Skeleton for Divide-and-conquer Unbalanced and Deep Problems
(Springer Nature, 2021)
[Abstract] The divide-and-conquer (D&C) pattern appears in a large number of problems and is highly suitable for exploiting parallelism. This has led to much research on its easy and efficient application both in shared and ...
A Highly Optimized Skeleton for Unbalanced and Deep Divide-And-Conquer Algorithms on Multi-Core Clusters
(Springer, 2022)
[Abstract] Efficiently implementing the divide-and-conquer pattern of parallelism in distributed memory systems is very relevant, given its ubiquity, and difficult, given its recursive nature and the need to exchange tasks ...
High-performance dataflow computing in hybrid memory systems with UPC++ DepSpawn
(Springer, 2021)
[Abstract] Dataflow computing is a very attractive paradigm for high-performance computing, given its ability to trigger computations as soon as their inputs are available. UPC++ DepSpawn is a novel task-based library ...
ScalaParBiBit: Scaling the Binary Biclustering in Distributed-Memory Systems
(SpringerLink, 2021-03-19)
[Abstract] Biclustering is a data mining technique that allows us to find groups of rows and columns that are highly correlated in a 2D dataset. Although there exist several software applications to perform biclustering, ...
OpenCNN: A Winograd Minimal Filtering Algorithm Implementation in CUDA
(MDPI, 2021)
[Abstract] Improving the performance of the convolution operation has become a key target for High Performance Computing (HPC) developers due to its prevalence in deep learning applied mainly to video processing. The ...
An automatic optimizer for heterogeneous devices
(Elsevier, 2020-05)
[Abstract] Codes written in a naive way seldom effectively exploit the computing resources, while writing optimized codes is usually a complex task that requires certain levels of expertise. This problem is further increased ...