Use this link to cite:
https://hdl.handle.net/2183/47590
A Hybrid Metaheuristics-Bayesian Optimization Framework with Safe Transfer Learning for Continuous Spark Tuning
Identifiers
Publication date
Authors
Advisors
Other responsibilities
Journal Title
Bibliographic citation
M. Garralda-Barrio, C. Eiras-Franco, and V. Bolón-Canedo, "A hybrid metaheuristics-Bayesian optimization framework with safe transfer learning for continuous Spark tuning", Future Generation Computer Systems, Vol. 178, May 2026, 108325, https://doi.org/10.1016/j.future.2025.108325
Type of academic work
Academic degree
Abstract
Tuning configuration parameters in distributed Big Data engines such as Apache Spark is a high-dimensional, workload-dependent problem with significant impact on performance and operational cost. We address this challenge with a hybrid optimization framework that integrates Iterated Local Search, Tabu Search, and locally embedded Bayesian Optimization guided by STL-PARN (safe transfer learning with pattern-adaptive robust neighborhoods). Historical executions are partitioned into a Nucleus of reliable neighbors and a Corona of exploratory configurations, ensuring relevance while mitigating negative transfer. The surrogate within the embedded Bayesian Optimization stage decouples performance prediction from uncertainty modeling, enabling parameter-free acquisition functions that self-adapt to diverse workloads. Experiments on a modernized HiBench suite across multiple input scales show consistent gains over state-of-the-art baselines in execution time, convergence, and cost efficiency. Overall, the results demonstrate the robustness and practical value of embedding Bayesian Optimization within a global metaheuristic loop for adaptive, cost-aware Spark tuning. All source code and datasets are publicly available, supporting reproducibility and operational efficiency in large-scale data processing.
Description
The source code and experimental resources supporting this study are openly available on GitHub (https://github.com/mgarralda/garralda-performance-model). They are released for academic and research use under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Editor version
Rights
Attribution-NonCommercial 4.0 International








