Analysis of I/O Performance on an Amazon EC2 Cluster Compute and High I/O Platform

Use this link to cite
http://hdl.handle.net/2183/20868
Collections
- Investigación (FIC) [1635]
Title
Analysis of I/O Performance on an Amazon EC2 Cluster Compute and High I/O Platform
Author(s)
Expósito, R. R.
Taboada, G. L.
Ramos, S.
González-Domínguez, J.
Touriño, J.
Doallo, R.
Date
2013-12
Citation
Expósito, R. R., Taboada, G. L., Ramos, S., González-Domínguez, J., Touriño, J., & Doallo, R. (2013). Analysis of I/O performance on an Amazon EC2 cluster compute and high I/O platform. Journal of Grid Computing, 11(4), 613-631.
Abstract
Cloud computing is currently being explored by the scientific community to assess its suitability for High Performance Computing (HPC) environments. In this novel paradigm, compute and storage resources, as well as applications, can be dynamically provisioned on a pay-per-use basis. This paper presents a thorough evaluation of the I/O storage subsystem using the Amazon EC2 Cluster Compute platform and the recent High I/O instance type, to determine its suitability for I/O-intensive applications. The evaluation has been carried out at different layers using representative benchmarks in order to evaluate the low-level cloud storage devices available in Amazon EC2, namely ephemeral disks and Elastic Block Store (EBS) volumes, on both local and distributed file systems. In addition, several I/O interfaces commonly used by scientific workloads (POSIX, MPI-IO and HDF5) have been assessed. Furthermore, the scalability of a representative parallel I/O code has been analyzed at the application level, taking into account both performance and cost metrics. The analysis of the experimental results has shown that the available cloud storage devices can have different performance characteristics and usage constraints. Our comprehensive evaluation can help scientists to significantly increase (by up to several times) the performance of I/O-intensive applications in the Amazon EC2 cloud. An example of an optimal configuration that can maximize I/O performance in this cloud is the use of a RAID 0 of 2 ephemeral disks, TCP with a 9,000-byte MTU, NFS async and MPI-IO on the High I/O instance type, which provides ephemeral disks backed by Solid State Drive (SSD) technology.
Keywords
Cloud computing
Virtualization
I/O performance evaluation
Network File System (NFS)
MPI-IO
Solid State Drive (SSD)
Description
This is a post-peer-review, pre-copyedit version of an article published in Journal of Grid Computing. The final authenticated version is available online at: https://doi.org/10.1007/s10723-013-9250-y
ISSN
1570-7873
1572-9184