Title

Using the Cloud for Parameter Estimation Problems: Comparing Spark vs MPI with a Case-Study

Author(s)

González, Patricia; Pardo, Xoán C.; Penas, David R.; Teijeiro, Diego; Banga, Julio R.; Doallo, Ramón

Date

2017-07-13

Bibliographic citation

P. González, X. C. Pardo, D. R. Penas, D. Teijeiro, J. R. Banga and R. Doallo, "Using the Cloud for Parameter Estimation Problems: Comparing Spark vs MPI with a Case-Study," 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), Madrid, 2017, pp. 797-806, doi: 10.1109/CCGRID.2017.58.

Abstract

Systems biology is an emerging approach focused on generating new knowledge about complex biological systems by combining experimental data with mathematical modeling and advanced computational techniques. Many problems in this field are extremely challenging and require substantial supercomputing resources to be solved. This is the case for parameter estimation in large-scale nonlinear dynamic systems biology models. Recently, Cloud Computing has emerged as a new paradigm for on-demand delivery of computing resources. However, the scientific computing community has been quite hesitant to use the Cloud, simply because traditional programming models do not fit well with the new paradigm, and the earliest cloud programming models do not allow most scientific computations to be run efficiently in the Cloud. In this paper we explore and compare two distributed computing models: the MPI (Message Passing Interface) model, which is high-performance oriented, and the Spark model, which is throughput oriented but outperforms other cloud programming solutions by adding improved support for iterative algorithms through in-memory computing. The performance of a well-known metaheuristic, the Differential Evolution algorithm, has been thoroughly assessed using a challenging parameter estimation problem from the domain of computational systems biology. The experiments have been carried out both in a local cluster and in the Microsoft Azure public cloud, allowing performance and cost evaluation for both infrastructures.

Keywords

Cloud computing
Computational modeling
Sparks
Programming
Sociology
Statistics
Data models

Description

Date of Conference: 14-17 May 2017. Conference Location: Madrid

Publisher's version

https://doi.org/10.1109/CCGRID.2017.58

Rights

© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.