Serverless-like platform for container-based YARN clusters

Castellanos-Rodríguez, Ó., Expósito, R. R., Enes, J., Taboada, G. L., & Touriño, J. (2024). Serverless-like platform for container-based YARN clusters. Future Generation Computer Systems, Vol. 155, pp. 256-271. https://doi.org/10.1016/j.future.2024.02.013

Abstract

Serverless computing is an emerging paradigm that has gained considerable relevance in recent years, as it allows users to consume computing resources without worrying about the underlying infrastructure and to pay only for what they actually use. Most current services that implement this paradigm rely on the Function-as-a-Service (FaaS) model, which works well for simple applications based on stateless functions triggered by specific events. However, these services are not designed to run more complex applications with intricate interactions: they are usually difficult to configure and/or offer little ability to customise the execution environment. They also tend to be designed for short and simple workloads, with some services even limiting the maximum runtime to just a few minutes. In this paper, we present a platform based on Hadoop YARN oriented to the execution of Big Data workloads in a containerised and serverless way, so that the resources allocated to the containers are automatically and dynamically scaled according to their actual usage. An experimental evaluation has been carried out to compare our serverless-like platform with a standard YARN deployment when executing Big Data workloads concurrently. Our results provide experimental evidence that the platform improves both performance and overall resource efficiency, with runtime reductions of up to 41% and resource usage improvements of up to 50%.
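
To make the idea of scaling container resources according to actual usage more concrete, the following is a minimal illustrative sketch, not the algorithm used in the paper. It assumes a simple rule in which a container's CPU limit tracks its measured usage plus a fixed headroom, clamped between a minimum and a maximum. The `Container` class, the `rescale` function and the `headroom`, `min_limit` and `max_limit` parameters are all hypothetical names and values introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Container:
    """Hypothetical view of a container's CPU allocation and usage."""
    name: str
    cpu_limit: float  # CPU currently allocated, in cores (assumed unit)
    cpu_usage: float  # CPU actually used, in cores (assumed unit)


def rescale(container: Container,
            headroom: float = 0.25,
            min_limit: float = 0.5,
            max_limit: float = 8.0) -> float:
    """Return a new CPU limit that follows usage plus a safety headroom.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    target = container.cpu_usage * (1.0 + headroom)
    # Clamp the target so a container never drops below a floor
    # or grows beyond the assumed per-container ceiling.
    return max(min_limit, min(max_limit, target))


if __name__ == "__main__":
    busy = Container(name="worker-1", cpu_limit=2.0, cpu_usage=3.0)
    idle = Container(name="worker-2", cpu_limit=4.0, cpu_usage=0.1)
    for c in (busy, idle):
        print(c.name, c.cpu_limit, "->", rescale(c))
```

Under these assumptions, an idle container gives back most of its allocation while a busy one is granted more, which is the general behaviour the abstract describes; a real policy would also have to account for memory, host capacity and the scheduler's view of the cluster.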