Population Subset Selection for the Use of a Validation Dataset for Overfitting Control in Genetic Programming
| UDC.coleccion | Investigación | es_ES |
| UDC.departamento | Ciencias da Computación e Tecnoloxías da Información | es_ES |
| UDC.endPage | 271 | es_ES |
| UDC.grupoInv | Redes de Neuronas Artificiais e Sistemas Adaptativos -Informática Médica e Diagnóstico Radiolóxico (RNASA - IMEDIR) | es_ES |
| UDC.grupoInv | RNASA - IMEDIR (INIBIC) | es_ES |
| UDC.institutoCentro | INIBIC - Instituto de Investigacións Biomédicas de A Coruña | es_ES |
| UDC.issue | 2 | es_ES |
| UDC.journalTitle | Journal of Experimental & Theoretical Artificial Intelligence | es_ES |
| UDC.startPage | 243 | es_ES |
| UDC.volume | 32 | es_ES |
| dc.contributor.author | Rivero, Daniel | |
| dc.contributor.author | Fernández-Blanco, Enrique | |
| dc.contributor.author | Fernández-Lozano, Carlos | |
| dc.contributor.author | Pazos, A. | |
| dc.date.accessioned | 2020-09-16T09:39:48Z | |
| dc.date.available | 2020-09-16T09:39:48Z | |
| dc.date.issued | 2019-07-31 | |
| dc.description.abstract | [Abstract] Genetic Programming (GP) is a technique that can solve different problems through the evolution of mathematical expressions. However, its tendency to overfit the data is one of the main obstacles to applying it. The use of a validation dataset is a common way to prevent overfitting in many Machine Learning (ML) techniques, including GP. However, one key point differentiates GP from other ML techniques: instead of training a single model, GP evolves a population of models. Therefore, the validation dataset can be used in several ways, because any of those evolved models could be evaluated on it. This work explores the possibility of using the validation dataset not only on the training-best individual but also on a subset made up of the training-best individuals of the population. The study has been conducted with five well-known databases covering regression or classification tasks. In most cases, the results point to an improvement when the validation dataset is used on a subset of the population instead of only on the training-best individual, which also induces a reduction in the number of nodes and, consequently, a lower complexity of the expressions. | |
| dc.description.sponsorship | Xunta de Galicia; ED431G/01 | es_ES |
| dc.description.sponsorship | Xunta de Galicia; ED431D 2017/16 | es_ES |
| dc.description.sponsorship | Xunta de Galicia; ED431C 2018/49 | es_ES |
| dc.description.sponsorship | Xunta de Galicia; ED431D 2017/23 | es_ES |
| dc.description.sponsorship | Instituto de Salud Carlos III; PI17/01826 | es_ES |
| dc.identifier.citation | Rivero D, Fernandez-Blanco E, Fernandez-Lozano C, Pazos A. Population subset selection for the use of a validation dataset for overfitting control in genetic programming. J Exp Theor Artif Intell. 2020; 32(2):243-271 | es_ES |
| dc.identifier.issn | 0952-813X | |
| dc.identifier.uri | http://hdl.handle.net/2183/26190 | |
| dc.language.iso | eng | es_ES |
| dc.publisher | Taylor & Francis Group | es_ES |
| dc.relation.uri | https://doi.org/10.1080/0952813X.2019.1647562 | es_ES |
| dc.rights | This is an accepted manuscript of an article published by Taylor & Francis in "Journal of Experimental & Theoretical Artificial Intelligence", available at Taylor & Francis Online | es_ES |
| dc.rights.accessRights | open access | es_ES |
| dc.subject | Genetic programming | es_ES |
| dc.subject | Overfitting | es_ES |
| dc.subject | Validation | es_ES |
| dc.subject | Evolutionary computation | es_ES |
| dc.title | Population Subset Selection for the Use of a Validation Dataset for Overfitting Control in Genetic Programming | es_ES |
| dc.type | journal article | es_ES |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | d8e10433-ea19-4a35-8cc6-0c7b9f143a6d | |
| relation.isAuthorOfPublication | 244a6828-de1c-45f3-86b6-69bb81250814 | |
| relation.isAuthorOfPublication | e5ddd06a-3e7f-4bf4-9f37-5f1cf3d3430a | |
| relation.isAuthorOfPublication | fa192a4c-bffd-4b23-87ae-e68c29350cdc | |
| relation.isAuthorOfPublication.latestForDiscovery | d8e10433-ea19-4a35-8cc6-0c7b9f143a6d |
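
The idea described in the abstract, evaluating the validation dataset on a subset of the training-best individuals rather than only on the single training-best one, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that individuals are plain Python callables, that fitness is mean squared error, and that selection happens once over a finished population. The names `mse`, `select_by_validation` and `subset_size` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence, Tuple

# Assumption: an evolved individual is modelled as a callable x -> prediction.
Model = Callable[[float], float]
Dataset = Sequence[Tuple[float, float]]  # (input, target) pairs


def mse(model: Model, data: Dataset) -> float:
    """Mean squared error of a model on a dataset (assumed fitness measure)."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


def select_by_validation(
    population: Sequence[Model],
    train: Dataset,
    valid: Dataset,
    subset_size: int,
) -> Model:
    """Return the validation-best model among the `subset_size` training-best.

    With subset_size == 1 this reduces to returning the training-best
    individual, since it is the only candidate evaluated on validation data.
    """
    if subset_size < 1:
        raise ValueError("subset_size must be at least 1")
    # Rank the whole population by training error, best first.
    ranked = sorted(population, key=lambda m: mse(m, train))
    # Only the training-best subset is evaluated on the validation dataset.
    subset = ranked[:subset_size]
    return min(subset, key=lambda m: mse(m, valid))
```

With a subset size of 1 the sketch returns whichever individual fits the training data best, even if it generalises poorly; a larger subset lets a slightly worse-trained but better-generalising individual be chosen instead.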
Files
Original bundle
- Name: Pazos_2019_Population_subset_selection.pdf
- Size: 843.23 KB
- Format: Adobe Portable Document Format

