Showing items 1-10 of 16
Locality-Aware Automatic Parallelization for GPGPU with OpenHMPP Directives
(Springer New York LLC, 2016-06)
[Abstract] The use of GPUs for general purpose computation has increased dramatically in the past years due to the rising demands of computing power and their tremendous computing capacity at low cost. Hence, new programming ...
A Novel Compiler Support for Automatic Parallelization on Multicore Systems
(Elsevier, 2013-09)
[Abstract] The widespread use of multicore processors is not a consequence of significant advances in parallel programming. In contrast, multicore processors arise due to the complexity of building power-efficient, ...
Affine Modeling of Program Traces
(Institute of Electrical and Electronics Engineers, 2019-02-01)
[Abstract] A formal, high-level representation of programs is typically needed for static and dynamic analyses performed by compilers. However, the source code of target applications is not always available in an analyzable ...
Analysis of Performance-impacting Factors on Checkpointing Frameworks: The CPPC Case Study
(Oxford University Press, 2011-11-01)
[Abstract] This paper focuses on the performance evaluation of Compiler for Portable Checkpointing (CPPC), a tool for the checkpointing of parallel message-passing applications. Its performance and the factors that impact ...
A Heuristic Approach for the Automatic Insertion of Checkpoints in Message-Passing Codes
(Graz University of Technology, Institute for Information Systems and Computer Media, 2009-08)
[Abstract] Checkpointing tools may be typically implemented at two different abstraction levels: at the system level or at the application level. The latter has become a more popular alternative due to its flexibility and ...
Improving Scalability of Application-Level Checkpoint-Recovery by Reducing Checkpoint Sizes
(Springer Japan KK, 2013)
[Abstract] The execution times of large-scale parallel applications on today's multi/many-core systems are usually longer than the mean time between failures. Therefore, parallel applications must tolerate hardware failures ...
Optimizing Coherence Traffic in Manycore Processors Using Closed-Form Caching/Home Agent Mappings
(Institute of Electrical and Electronics Engineers, 2021-02-09)
[Abstract] Manycore processors feature a high number of general-purpose cores designed to work in a multithreaded fashion. Recent manycore processors are kept coherent using scalable distributed directories. A paramount ...
Volatile STT-RAM Scratchpad Design and Data Allocation for Low Energy
(Association for Computing Machinery, 2015)
[Abstract] On-chip power consumption is one of the fundamental challenges of current technology scaling. Cache memories consume a sizable part of this power, particularly due to leakage energy. STT-RAM is one of several ...
Parallel Hierarchical Radiosity on Hybrid Platforms
(Springer Verlag, 2011-12)
[Abstract] Achieving an efficient realistic illumination is an important aim of research in computer graphics. In this paper a new parallel global illumination method for hybrid systems based on the hierarchical radiosity ...
In-memory application-level checkpoint-based migration for MPI programs
(Springer New York LLC, 2014)
[Abstract] Process migration provides many benefits for parallel environments including dynamic load balancing, data access locality or fault tolerance. This paper describes an in-memory application-level checkpoint-based ...