Buscar
Mostrando ítems 1-3 de 3
Resilient MPI applications using an application-level checkpointing framework and ULFM
(Springer New York LLC, 2017-01)
[Abstract] Future exascale systems, formed by millions of cores, will present high failure rates, and long-running applications will need to make use of new fault tolerance techniques to ensure successful execution completion. ...
Assessing resilient versus stop-and-restart fault-tolerant solutions in MPI applications
(Springer New York LLC, 2017-01)
[Abstract] The Message Passing Interface (MPI) standard is the most popular parallel programming model for distributed systems. However, it lacks fault-tolerance support and, traditionally, failures are addressed with ...
A Portable and Adaptable Fault Tolerance Solution for Heterogeneous Applications
(Academic Press, 2017-06)
[Abstract] Heterogeneous systems have increased their popularity in recent years due to the high performance and reduced energy consumption capabilities provided by using devices such as GPUs or Xeon Phi accelerators. This ...