Showing items 21-30 of 37
Heterogeneous distributed computing based on high-level abstractions
(2018)
[Abstract]: The rise of heterogeneous systems has given place to great challenges for users as they involve new concepts, restrictions, and frameworks. Their exploitation is further complicated in the context of distributed ...
Guiding the Optimization of Parallel Codes on Multicores Using an Analytical Cache Model
(2018)
[Abstract]: Cache performance is particularly hard to predict in modern multicore processors as several threads can be concurrently in execution, and private cache levels are combined with shared ones. This paper presents ...
The New UPC++ DepSpawn High Performance Library for Data-Flow Computing with Hybrid Parallelism
(Springer, 2022)
[Abstract]: Data-flow computing is a natural and convenient paradigm for expressing parallelism. This is particularly true for tools that automatically extract the data dependencies among the tasks while allowing to exploit ...
Portable and efficient FFT and DCT algorithms with the Heterogeneous Butterfly Processing Library
(Elsevier, 2019-03)
[Abstract]: The existence of a wide variety of computing devices with very different properties makes essential the development of software that is not only portable among them, but which also adapts to the properties of ...
Easy Dataflow Programming in Clusters with UPC++ DepSpawn
(Institute of Electrical and Electronics Engineers, 2019-06-01)
[Abstract]: The Partitioned Global Address Space (PGAS) programming model is one of the most relevant proposals to improve the ability of developers to exploit distributed memory systems. However, despite its important ...
VENOM: A Vectorized N:M Format for Unleashing the Power of Sparse Tensor Cores
(Association for Computing Machinery, 2023-11)
[Abstract]: The increasing success and scaling of Deep Learning models demands higher computational efficiency and power. Sparsification can lead to both smaller models as well as higher compute efficiency, and accelerated ...
A Fast Solver for Large Tridiagonal Systems on Multi-Core Processors (Lass Library)
(Institute of Electrical and Electronics Engineers, 2019)
[Abstract]: Many problems of industrial and scientific interest require the solving of tridiagonal linear systems. This paper presents several implementations for the parallel solving of large tridiagonal systems on ...
Parallelization of shallow water simulations on current multi-threaded systems
(SAGE Journals, 2013-11)
[Abstract]: In this work, several parallel implementations of a numerical model of pollutant transport on a shallow water system are presented. These parallel implementations are developed in two phases. First, the sequential ...
Numerical Simulation of Pollutant Transport in a Shallow-Water System on the Cell Heterogeneous Processor
(Springer, 2013)
[Abstract]: This paper presents an implementation, optimized for the Cell processor, of a finite volume numerical scheme for 2D shallow-water systems with pollutant transport. A description of the special architecture and ...
A multi-GPU shallow-water simulation with transport of contaminants
(Wiley, 2012)
[Abstract]: This work presents cost-effective multi-graphics processing unit (GPU) parallel implementations of a finite-volume numerical scheme for solving pollutant transport problems in bidimensional domains. The fluid ...