Bagging cross-validated bandwidth selection in nonparametric regression estimation with applications to large-sized samples
| UDC.coleccion | Research | |
| UDC.departamento | Mathematics | |
| UDC.endPage | 18 | |
| UDC.grupoInv | Modelización, Optimización e Inferencia Estatística (MODES) | |
| UDC.institutoCentro | CITIC - Centro de Investigación de Tecnoloxías da Información e da Comunicación | |
| UDC.issue | 108257 | |
| UDC.journalTitle | Computational Statistics & Data Analysis | |
| UDC.startPage | 1 | |
| UDC.volume | 213 | |
| dc.contributor.author | Barreiro-Ures, Daniel | |
| dc.contributor.author | Cao, Ricardo | |
| dc.contributor.author | Francisco-Fernández, Mario | |
| dc.contributor.author | Fernández-Casal, Rubén | |
| dc.date.accessioned | 2025-08-13T11:43:53Z | |
| dc.date.available | 2025-08-13T11:43:53Z | |
| dc.date.issued | 2026-01 | |
| dc.description.abstract | [Abstract]: Cross-validation is a well-known and widely used bandwidth selection method in nonparametric regression estimation. However, this technique has two notable drawbacks: the large variability of the selected bandwidths and the inability to provide results in a reasonable time for very large sample sizes. To address these issues, bagged cross-validation bandwidth selectors are investigated. This approach consists of computing the cross-validation bandwidths for a finite number of subsamples and then rescaling the averaged smoothing parameters to the original sample size. Under a random-design regression model, asymptotic expressions up to second order for the bias and variance of the leave-one-out cross-validation bandwidth for the Nadaraya–Watson estimator are obtained. Subsequently, the asymptotic bias and variance and the limiting distribution of the bagged cross-validation selector are derived. Suitable choices of the number of subsamples and the subsample size lead to a convergence rate proportional to the inverse square root of the sample size for the bagged cross-validation selector, outperforming the slower rate typically associated with leave-one-out cross-validation. Several simulations and an illustration on a real dataset related to the COVID-19 pandemic show the behavior of our proposal and its better performance, in terms of statistical efficiency and computing time, compared with leave-one-out cross-validation. | |
| dc.description.sponsorship | This work is part of the grants PID2020-113578RB-I00 and PID2023-147127OB-I00 “ERDF/EU”, funded by MCIN/AEI/10.13039/501100011033/. It has also been supported by the Xunta de Galicia (Grupos de Referencia Competitiva ED431C-2024/14) and by CITIC, which, as a center accredited for excellence within the Galician University System and a member of the CIGUS Network, receives subsidies from the Department of Education, Science, Universities, and Vocational Training of the Xunta de Galicia. Additionally, it is co-financed by the EU through the FEDER Galicia 2021-27 operational program (Ref. ED431G 2023/01). Funding for open access charge: Universidade da Coruña/CISUG. | |
| dc.description.sponsorship | Funded for open access publication: Universidade da Coruña/CISUG | |
| dc.description.sponsorship | Xunta de Galicia; ED431C-2024/14 | |
| dc.description.sponsorship | Xunta de Galicia; ED431G 2023/01 | |
| dc.identifier.citation | Barreiro-Ures, D., Cao, R., Francisco-Fernández, M., & Fernández-Casal, R. (2026). Bagging cross-validated bandwidth selection in nonparametric regression estimation with applications to large-sized samples. Computational Statistics & Data Analysis, 213, 108257. | |
| dc.identifier.doi | 10.1016/j.csda.2025.108257 | |
| dc.identifier.issn | 0167-9473 | |
| dc.identifier.issn | 1872-7352 | |
| dc.identifier.uri | https://hdl.handle.net/2183/45604 | |
| dc.language.iso | eng | |
| dc.publisher | Elsevier | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113578RB-I00/ES/METODOS ESTADISTICOS FLEXIBLES EN CIENCIA DE DATOS PARA DATOS COMPLEJOS Y DE GRAN VOLUMEN: TEORIA Y APLICACIONES/ | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica, Técnica y de Innovación 2021-2023/PID2023-147127OB-I00/ES/INFERENCIA ESTADISTICA UTILIZANDO METODOS FLEXIBLES PARA DATOS COMPLEJOS: TEORIA Y APPLICACIONES | |
| dc.relation.uri | https://doi.org/10.1016/j.csda.2025.108257 | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | en |
| dc.rights.accessRights | open access | |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.subject | Bagging | |
| dc.subject | Bandwidth selection | |
| dc.subject | Cross-validation | |
| dc.subject | Kernel smoothing | |
| dc.subject | Nadaraya–Watson | |
| dc.subject | Subsampling | |
| dc.title | Bagging cross-validated bandwidth selection in nonparametric regression estimation with applications to large-sized samples | |
| dc.type | journal article | |
| dc.type.hasVersion | VoR | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 5e21e4cc-372f-4718-8f5d-0024ba87a995 | |
| relation.isAuthorOfPublication | 3360aaca-39be-43b4-a458-974e79cdbc6b | |
| relation.isAuthorOfPublication | 9724fb7a-c0db-4b2f-aa1a-7f79bf9c2064 | |
| relation.isAuthorOfPublication | 96b3567f-5599-4789-bdfe-e621516d18ef | |
| relation.isAuthorOfPublication.latestForDiscovery | 5e21e4cc-372f-4718-8f5d-0024ba87a995 |
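
The procedure summarized in the abstract (compute a cross-validation bandwidth on each of several subsamples, average, then rescale to the full sample size) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the Gaussian kernel, the grid search, subsampling without replacement, and the simulated data are all assumptions made for illustration, and the rescaling factor (r/n)^(1/5) assumes the usual n^(-1/5) rate of the optimal bandwidth for a second-order kernel.

```python
import numpy as np


def loo_cv_score(x, y, h):
    """Leave-one-out CV score of the Nadaraya-Watson estimator (Gaussian kernel)."""
    d = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * d ** 2)
    np.fill_diagonal(w, 0.0)          # leave each observation out of its own fit
    denom = w.sum(axis=1)
    if np.any(denom == 0.0):          # bandwidth too small: no usable neighbours
        return np.inf
    pred = (w @ y) / denom
    return float(np.mean((y - pred) ** 2))


def cv_bandwidth(x, y, grid):
    """Bandwidth on the grid that minimises the leave-one-out CV score."""
    scores = [loo_cv_score(x, y, h) for h in grid]
    return float(grid[int(np.argmin(scores))])


def bagged_cv_bandwidth(x, y, r, n_subsamples, grid, rng):
    """Average CV bandwidths over subsamples of size r, rescaled to size n."""
    n = len(x)
    hs = []
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=r, replace=False)   # assumption: no replacement
        hs.append(cv_bandwidth(x[idx], y[idx], grid))
    # Assumed rescaling: optimal bandwidths shrink like n^(-1/5).
    return (r / n) ** 0.2 * float(np.mean(hs))


# Hypothetical simulated data, for illustration only.
rng = np.random.default_rng(0)
n, r, n_subsamples = 2000, 200, 10
x = rng.uniform(0.0, 1.0, size=n)
y = np.sin(2.0 * np.pi * x) + 0.3 * rng.normal(size=n)
grid = np.geomspace(0.01, 1.0, 40)

h_bagged = bagged_cv_bandwidth(x, y, r, n_subsamples, grid, rng)
print(h_bagged)
```

Each subsample requires on the order of r^2 kernel evaluations per candidate bandwidth, against n^2 for leave-one-out cross-validation on the full sample, which is where the computational saving for large n comes from in this sketch.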
Files
Original bundle
- Name: Barreiro_Ures_Daniel_2026_Bagging_cross_validated_bandwidth_selection.pdf
- Size: 4.25 MB
- Format: Adobe Portable Document Format

