Expósito, Roberto R.; Taboada, Guillermo L.; Ramos Garea, Sabela; Touriño, Juan; Doallo, Ramón
2018-07-04
2016
Expósito, R. R., Taboada, G. L., Ramos, S., Touriño, J., & Doallo, R. (2016). Performance evaluation of data-intensive computing applications on a public IaaS cloud. The Computer Journal, 59(3), 287-307.
0010-4620
1460-2067
http://hdl.handle.net/2183/20849
This is a post-peer-review, pre-copyedit version of an article published in The Computer Journal. The final authenticated version is available online at: https://doi.org/10.1093/comjnl/bxu111
[Abstract] The advent of cloud computing technologies, which dynamically provide on-demand access to computational resources over the Internet, is offering new possibilities to many scientists and researchers. Nowadays, Infrastructure as a Service (IaaS) cloud providers can offset the increasing processing requirements of data-intensive computing applications, becoming an emerging alternative to traditional servers and clusters. In this paper, a comprehensive study of the leading public IaaS cloud platform, Amazon EC2, has been conducted in order to assess its suitability for data-intensive computing. One of the key contributions of this work is the analysis of the storage-optimized family of EC2 instances. Furthermore, this study presents a detailed analysis of both performance and cost metrics. More specifically, multiple experiments have been carried out to analyze the full I/O software stack, ranging from the low-level storage devices and cluster file systems up to real-world applications using representative data-intensive parallel codes and MapReduce-based workloads.
The analysis of the experimental results has shown that data-intensive applications can benefit from tailored EC2-based virtual clusters, enabling users to obtain the highest performance and cost-effectiveness in the cloud.
eng
Data intensive computing; Cloud computing; Infrastructure as a service; Amazon EC2; Cluster file system; MapReduce
Performance Evaluation of Data-Intensive Computing Applications on a Public IaaS Cloud
journal article
open access
10.1093/comjnl/bxu111