Showing items 1-10 of 10
Locality-Aware Automatic Parallelization for GPGPU with OpenHMPP Directives
(Springer New York LLC, 2016-06)
[Abstract] The use of GPUs for general purpose computation has increased dramatically in the past years due to the rising demands of computing power and their tremendous computing capacity at low cost. Hence, new programming ...
A Novel Compiler Support for Automatic Parallelization on Multicore Systems
(Elsevier, 2013-09)
[Abstract] The widespread use of multicore processors is not a consequence of significant advances in parallel programming. In contrast, multicore processors arise due to the complexity of building power-efficient, ...
Affine Modeling of Program Traces
(Institute of Electrical and Electronics Engineers, 2019-02-01)
[Abstract] A formal, high-level representation of programs is typically needed for static and dynamic analyses performed by compilers. However, the source code of target applications is not always available in an analyzable ...
Analysis of Performance-impacting Factors on Checkpointing Frameworks: The CPPC Case Study
(Oxford University Press, 2011-11-01)
[Abstract] This paper focuses on the performance evaluation of Compiler for Portable Checkpointing (CPPC), a tool for the checkpointing of parallel message-passing applications. Its performance and the factors that impact ...
A Heuristic Approach for the Automatic Insertion of Checkpoints in Message-Passing Codes
(Technische Universitaet Graz, Institut fuer Informationssysteme und Computer Medien; Graz University of Technology, Institute for Information Systems and Computer Media, 2009-08)
[Abstract] Checkpointing tools may be typically implemented at two different abstraction levels: at the system level or at the application level. The latter has become a more popular alternative due to its flexibility and ...
Optimizing Coherence Traffic in Manycore Processors Using Closed-Form Caching/Home Agent Mappings
(Institute of Electrical and Electronics Engineers, 2021-02-09)
[Abstract] Manycore processors feature a high number of general-purpose cores designed to work in a multithreaded fashion. Recent manycore processors are kept coherent using scalable distributed directories. A paramount ...
Volatile STT-RAM Scratchpad Design and Data Allocation for Low Energy
(Association for Computing Machinery, 2015)
[Abstract] On-chip power consumption is one of the fundamental challenges of current technology scaling. Cache memories consume a sizable part of this power, particularly due to leakage energy. STT-RAM is one of several ...
Representing Integer Sequences Using Piecewise-Affine Loops
(MDPI, 2021)
[Abstract] A formal, high-level representation of programs is typically needed for static and dynamic analyses performed by compilers. However, the source code of target applications is not always available in an analyzable ...
CPPC: a compiler‐assisted tool for portable checkpointing of message‐passing applications
(John Wiley & Sons Ltd., 2010-11-19)
[Abstract] With the evolution of high‐performance computing toward heterogeneous, massively parallel systems, parallel applications have developed new checkpoint and restart necessities. Whether due to a failure in the ...
Compiler-Assisted Checkpointing of Parallel Codes: The Cetus and LLVM Experience
(Springer New York LLC, 2013)
[Abstract] With the evolution of high-performance computing, parallel applications have developed an increasing necessity for fault tolerance, most commonly provided by checkpoint and restart techniques. Checkpointing tools ...