Buscar
Mostrando ítems 1-10 de 22
Resilient MPI applications using an application-level checkpointing framework and ULFM
(Springer New York LLC, 2017-01)
[Abstract] Future exascale systems, formed by millions of cores, will present high failure rates, and long-running applications will need to make use of new fault tolerance techniques to ensure successful execution completion. ...
Analysis of Performance-impacting Factors on Checkpointing Frameworks: The CPPC Case Study
(Oxford University Press, 2011-11-01)
[Abstract] This paper focuses on the performance evaluation of Compiler for Portable Checkpointing (CPPC), a tool for the checkpointing of parallel message-passing applications. Its performance and the factors that impact ...
Improving Scalability of Application-Level Checkpoint-Recovery by Reducing Checkpoint Sizes
(Springer Japan KK, 2013)
[Abstract] The execution times of large-scale parallel applications on nowadays multi/many-core systems are usually longer than the mean time between failures. Therefore, parallel applications must tolerate hardware failures ...
Enhanced global optimization methods applied to complex fisheries stock assessment models
(Elsevier BV, 2019-04)
[Abstract]
Statistical fisheries models are frequently used by researchers and agencies to understand the behavior of marine ecosystems or to estimate the maximum acceptable catch of different species of commercial interest. ...
Assessing resilient versus stop-and-restart fault-tolerant solutions in MPI applications
(Springer New York LLC, 2017-01)
[Abstract] The Message Passing Interface (MPI) standard is the most popular parallel programming model for distributed systems. However, it lacks fault-tolerance support and, traditionally, failures are addressed with ...
Novel parallelization of simulated annealing and Hooke & Jeeves search algorithms for multicore systems with application to complex fisheries stock assessment models
(Elsevier Ltd, 2016-11)
[Abstract] Estimating parameters of a statistical fisheries assessment model typically involves a comparison of disparate datasets to a forward simulation model through a likelihood function. In all but trivial cases the ...
Reducing the overhead of an MPI application-level migration approach
(Elsevier BV * North-Holland, 2016)
[Abstract] Process migration provides many benefits for parallel environments including dynamic load balance, data access locality, or fault tolerance. This work proposes a solution that reduces the memory and I/O overhead ...
Parallel definition of tear film maps on distributed-memory clusters for the support of dry eye diagnosis
(Elsevier Ireland Ltd., 2017)
[Abstract] Background and objectives
The analysis of the interference patterns on the tear film lipid layer is a useful clinical test to diagnose dry eye syndrome. This task can be automated with a high degree of accuracy ...
A 2D algorithm with asymmetric workload for the UPC conjugate gradient method
(Springer New York LLC, 2014)
[Abstract] This paper examines four different strategies, each one with its own data distribution, for implementing the parallel conjugate gradient (CG) method and how they impact communication and overall performance. ...
Local Rollback for Resilient Mpi Applications With Application-Level Checkpointing and Message Logging
(Elsevier BV * North-Holland, 2019-02)
[Abstract]
The resilience approach generally used in high-performance computing (HPC) relies on coordinated checkpoint/restart, a global rollback of all the processes that are running the application. However, in many ...