How Important Is Data Quality? Best Classifiers vs Best Features
| UDC.coleccion | Research | es_ES |
| UDC.departamento | Ciencias da Computación e Tecnoloxías da Información | es_ES |
| UDC.grupoInv | Laboratorio de Investigación e Desenvolvemento en Intelixencia Artificial (LIDIA) | es_ES |
| UDC.issue | 365 | es_ES |
| UDC.journalTitle | Neurocomputing | es_ES |
| UDC.startPage | 375 | es_ES |
| UDC.volume | 470 | es_ES |
| dc.contributor.author | Morán-Fernández, Laura | |
| dc.contributor.author | Bolón-Canedo, Verónica | |
| dc.contributor.author | Alonso-Betanzos, Amparo | |
| dc.date.accessioned | 2022-03-17T18:21:02Z | |
| dc.date.available | 2022-03-17T18:21:02Z | |
| dc.date.issued | 2021 | |
| dc.description | Funded for open access publication: Universidade da Coruña/CISUG | es_ES |
| dc.description.abstract | [Abstract] Choosing the appropriate classifier for a given scenario is not an easy task. First, an increasingly large number of algorithms, belonging to different families, is available. Second, there is a lack of methodologies that can recommend in advance a given family of algorithms for a certain type of dataset. Moreover, most of these classification algorithms exhibit degraded performance when faced with datasets containing irrelevant and/or redundant features. In this work we analyze the impact of feature selection on classification over several synthetic and real datasets. The experimental results show that the significance of selecting a classifier decreases after applying an appropriate preprocessing step; this not only eases the choice of classifier but also improves the results on almost all the datasets tested. | es_ES |
| dc.description.sponsorship | This work has been supported by the National Plan for Scientific and Technical Research and Innovation of the Spanish Government (Grant PID2019-109238 GB-C2), and by the Xunta de Galicia (Grant ED431C 2018/34) with the European Union ERDF funds. CITIC, as a Research Center accredited by the Galician University System, is funded by “Consellería de Cultura, Educación e Universidades from Xunta de Galicia”, with 80% of the support coming through ERDF Funds, ERDF Operational Programme Galicia 2014–2020, and the remaining 20% from “Secretaría Xeral de Universidades” (Grant ED431G 2019/01). Funding for open access charge: Universidade da Coruña/CISUG | es_ES |
| dc.description.sponsorship | Xunta de Galicia; ED431C 2018/34 | es_ES |
| dc.description.sponsorship | Xunta de Galicia; ED431G 2019/01 | es_ES |
| dc.identifier.citation | MORÁN-FERNÁNDEZ, Laura, BOLÓN-CANEDO, Verónica and ALONSO-BETANZOS, Amparo, 2022. How important is data quality? Best classifiers vs best features. Neurocomputing. 22 January 2022. Vol. 470, p. 365–375. DOI 10.1016/j.neucom.2021.05.107 | es_ES |
| dc.identifier.doi | 10.1016/j.neucom.2021.05.107 | |
| dc.identifier.uri | http://hdl.handle.net/2183/30051 | |
| dc.language.iso | eng | es_ES |
| dc.publisher | Elsevier | es_ES |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-109238GB-C21/ES/SISTEMAS DE RECOMENDACION EXPLICABLES/ | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-109238GB-C22/ES/APRENDIZAJE AUTOMATICO ESCALABLE Y EXPLICABLE/ | |
| dc.relation.uri | http://dx.doi.org/10.1016/j.neucom.2021.05.107 | es_ES |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | es_ES |
| dc.rights.accessRights | open access | es_ES |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
| dc.subject | Feature selection | es_ES |
| dc.subject | Filters | es_ES |
| dc.subject | Preprocessing | es_ES |
| dc.subject | High dimensionality | es_ES |
| dc.subject | Classification | es_ES |
| dc.subject | Data analysis | es_ES |
| dc.title | How Important Is Data Quality? Best Classifiers vs Best Features | es_ES |
| dc.type | journal article | es_ES |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | dfd64126-0d31-4365-b205-4d44ed5fa9c0 | |
| relation.isAuthorOfPublication | c114dccd-76e4-4959-ba6b-7c7c055289b1 | |
| relation.isAuthorOfPublication | a89f1cad-dbc5-471f-986a-26c021ed4a95 | |
| relation.isAuthorOfPublication.latestForDiscovery | dfd64126-0d31-4365-b205-4d44ed5fa9c0 |
Files
Original bundle
- Name: Moran_Fernandez_Laura_2021_How_Important_Is_Data_Quality.pdf
- Size: 674.49 KB
- Format: Adobe Portable Document Format