Authors: Garralda-Barrio, Mariano; Eiras-Franco, Carlos; Bolón-Canedo, Verónica
Dates: 2026-03-05; 2026-03-05; 2025-12
Citation: M. Garralda-Barrio, C. Eiras-Franco, and V. Bolón-Canedo, "A hybrid metaheuristics-Bayesian optimization framework with safe transfer learning for continuous spark tuning", Future Generation Computer Systems, Vol. 178, May 2026, 108325, https://doi.org/10.1016/j.future.2025.108325
ISSN: 1872-7115
Handle: https://hdl.handle.net/2183/47590
Note: The source code and experimental resources supporting this study are openly available at Github (https://github.com/mgarralda/garralda-performance-model). They are released for academic and research use under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).

[Abstract]: Tuning configuration parameters in distributed Big Data engines such as Apache Spark is a high-dimensional, workload-dependent problem with significant impact on performance and operational cost. We address this challenge with a hybrid optimization framework that integrates Iterated Local Search, Tabu Search, and locally embedded Bayesian Optimization guided by STL-PARN (safe transfer learning with pattern-adaptive robust neighborhoods). Historical executions are partitioned into a Nucleus of reliable neighbors and a Corona of exploratory configurations, ensuring relevance while mitigating negative transfer. The surrogate within the embedded Bayesian Optimization stage decouples performance prediction from uncertainty modeling, enabling parameter-free acquisition functions that self-adapt to diverse workloads. Experiments on a modernized HiBench suite across multiple input scales show consistent gains over state-of-the-art baselines in execution time, convergence, and cost efficiency. Overall, the results demonstrate the robustness and practical value of embedding Bayesian Optimization within a global metaheuristic loop for adaptive, cost-aware Spark tuning. All source code and datasets are publicly available, supporting reproducibility and operational efficiency in large-scale data processing.

Language: eng
Rights: Attribution-NonCommercial 4.0 International
Rights URI: http://creativecommons.org/licenses/by-nc/4.0/
Keywords: Spark continuous tuning; Bayesian optimization; Safe transfer learning; Metaheuristics; Big data
Title: A Hybrid Metaheuristics-Bayesian Optimization Framework with Safe Transfer Learning for Continuous Spark Tuning
Type: journal article
Access: open access
DOI: 10.1016/j.future.2025.108325