Showing items 1-10 of 18
Resilient MPI applications using an application-level checkpointing framework and ULFM
(Springer New York LLC, 2017-01)
[Abstract] Future exascale systems, formed by millions of cores, will present high failure rates, and long-running applications will need to make use of new fault tolerance techniques to ensure successful execution completion. ...
Analysis of Performance-impacting Factors on Checkpointing Frameworks: The CPPC Case Study
(Oxford University Press, 2011-11-01)
[Abstract] This paper focuses on the performance evaluation of Compiler for Portable Checkpointing (CPPC), a tool for the checkpointing of parallel message-passing applications. Its performance and the factors that impact ...
ParBiBit: Parallel tool for binary biclustering on modern distributed-memory systems
(PLoS, 2018)
[Abstract] Biclustering techniques are gaining attention in the analysis of large-scale datasets as they identify two-dimensional submatrices where both rows and columns are correlated. In this work we present ParBiBit, ...
Improving Scalability of Application-Level Checkpoint-Recovery by Reducing Checkpoint Sizes
(Springer Japan KK, 2013)
[Abstract] The execution times of large-scale parallel applications on today's multi/many-core systems are usually longer than the mean time between failures. Therefore, parallel applications must tolerate hardware failures ...
Parallelization of ARACNe, an Algorithm for the Reconstruction of Gene Regulatory Networks
(MDPI AG, 2019-07-31)
[Abstract] Gene regulatory networks are graphical representations of molecular regulators that interact with each other and with other substances in the cell to govern the gene expression. There are different computational ...
Assessing resilient versus stop-and-restart fault-tolerant solutions in MPI applications
(Springer New York LLC, 2017-01)
[Abstract] The Message Passing Interface (MPI) standard is the most popular parallel programming model for distributed systems. However, it lacks fault-tolerance support and, traditionally, failures are addressed with ...
Reducing the overhead of an MPI application-level migration approach
(Elsevier BV * North-Holland, 2016)
[Abstract] Process migration provides many benefits for parallel environments including dynamic load balance, data access locality, or fault tolerance. This work proposes a solution that reduces the memory and I/O overhead ...
Local Rollback for Resilient MPI Applications With Application-Level Checkpointing and Message Logging
(Elsevier BV * North-Holland, 2019-02)
[Abstract] The resilience approach generally used in high-performance computing (HPC) relies on coordinated checkpoint/restart, a global rollback of all the processes that are running the application. However, in many ...
Implementing cloud-based parallel metaheuristics: an overview
(Universidad Nacional de la Plata - Facultad de Informatica, 2018-12-12)
[Abstract] Metaheuristics are among the most popular methods for solving hard global optimization problems in many areas of science and engineering. Their parallel implementation applying HPC techniques is a common ...
MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems
(Oxford University Press, 2016)
[Abstract] MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input ...