Title: Bagging cross-validated bandwidth selection in nonparametric regression estimation with applications to large-sized samples
Authors: Barreiro-Ures, Daniel; Cao, Ricardo; Francisco-Fernández, Mario; Fernández-Casal, Rubén
Dates: 2025-08-13; 2025-08-13; 2026-01
Citation: Barreiro-Ures, D., Cao, R., Francisco-Fernández, M., & Fernández-Casal, R. (2025). Bagging cross-validated bandwidth selection in nonparametric regression estimation with applications to large-sized samples. Computational Statistics & Data Analysis, 108257.
ISSN: 0167-9473; 1872-7352
URI: https://hdl.handle.net/2183/45604

Abstract: Cross-validation is a well-known and widely used bandwidth selection method in nonparametric regression estimation. However, this technique has two notable drawbacks: the large variability of the selected bandwidths, and the inability to provide results in a reasonable time for very large sample sizes. To address these issues, bagged cross-validation bandwidth selectors are investigated. This approach consists of computing the cross-validation bandwidths for a finite number of subsamples and then rescaling the averaged smoothing parameters to the original sample size. Under a random-design regression model, asymptotic expressions up to second order for the bias and variance of the leave-one-out cross-validation bandwidth for the Nadaraya–Watson estimator are obtained. Subsequently, the asymptotic bias and variance and the limiting distribution of the bagged cross-validation selector are derived. Suitable choices of the number of subsamples and the subsample size lead to a convergence rate proportional to the inverse square root of the sample size for the bagged cross-validation selector, outperforming the slower rate typically associated with leave-one-out cross-validation. Several simulations and an illustration on a real dataset related to the COVID-19 pandemic show the behavior of the proposal and its better performance, in terms of statistical efficiency and computing time, when compared to leave-one-out cross-validation.

Language: eng
License: Attribution-NonCommercial-NoDerivatives 4.0 International (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Keywords: Bagging; Bandwidth selection; Cross-validation; Kernel smoothing; Nadaraya–Watson; Subsampling
Type: journal article
Access: open access
DOI: 10.1016/j.csda.2025.108257
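
The bagging procedure summarized in the abstract (cross-validate on subsamples, average the bandwidths, rescale to the full sample size) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a Gaussian kernel, a grid search for the cross-validation minimizer, subsamples drawn without replacement, and a rescaling factor of (r/n)^(1/5), which follows from the usual n^(-1/5) order of the optimal Nadaraya–Watson bandwidth. The function names, the grid, and the simulated data are all hypothetical.

```python
import numpy as np


def nw_loocv_score(x, y, h):
    """Leave-one-out CV score of the Nadaraya-Watson estimator (Gaussian kernel)."""
    u = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)
    np.fill_diagonal(w, 0.0)  # each point is left out of its own prediction
    denom = w.sum(axis=1)
    if np.any(denom <= 0.0):  # bandwidth too small to predict every point
        return np.inf
    pred = (w @ y) / denom
    return float(np.mean((y - pred) ** 2))


def cv_bandwidth(x, y, grid):
    """Grid point minimizing the leave-one-out CV score (assumed search strategy)."""
    scores = [nw_loocv_score(x, y, h) for h in grid]
    return float(grid[int(np.argmin(scores))])


def bagged_cv_bandwidth(x, y, r, n_subsamples, grid, rng):
    """Average CV bandwidths over subsamples of size r, then rescale to size n."""
    n = len(x)
    hs = []
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=r, replace=False)  # assumption: no replacement
        hs.append(cv_bandwidth(x[idx], y[idx], grid))
    # Assumed rescaling: the optimal bandwidth is of order n^(-1/5)
    return (r / n) ** 0.2 * float(np.mean(hs))


# Illustrative usage on simulated data (not from the paper)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 1000)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, 1000)
grid = np.geomspace(0.01, 1.0, 30)
h_bag = bagged_cv_bandwidth(x, y, r=100, n_subsamples=10, grid=grid, rng=rng)
print(h_bag)
```

Each subsample requires an r-by-r kernel matrix rather than an n-by-n one, which is where the computational saving for large samples comes from in this sketch.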