How Important Is Data Quality? Best Classifiers vs Best Features

Use this link to cite
http://hdl.handle.net/2183/30051
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International
Collections
- Research (FIC) [1685]
Metadata
Title
How Important Is Data Quality? Best Classifiers vs Best Features
Date
2021
Bibliographic citation
MORÁN-FERNÁNDEZ, Laura, BOLÓN-CANEDO, Verónica and ALONSO-BETANZOS, Amparo, 2022. How important is data quality? Best classifiers vs best features. Neurocomputing. 22 January 2022. Vol. 470, p. 365–375. DOI 10.1016/j.neucom.2021.05.107
Abstract
[Abstract] Choosing the appropriate classifier for a given scenario is not an easy task. First, an increasingly large number of algorithms is available, belonging to different families. Second, there is a lack of methodologies that can recommend in advance a given family of algorithms for a certain type of dataset. Moreover, most of these classification algorithms show degraded performance when faced with datasets containing irrelevant and/or redundant features. In this work we analyze the impact of feature selection on classification over several synthetic and real datasets. The experimental results show that the choice of classifier becomes less significant after applying an appropriate preprocessing step; this not only eases the choice but also improves the results in almost all the datasets tested.
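As a rough illustration of the kind of preprocessing step the abstract refers to, the following minimal sketch ranks features with a simple filter and keeps the top-scoring ones. It is not the authors' method: the absolute Pearson correlation score, the synthetic data, and the names pearson, rank_features and select_top_k are all assumptions made for this example.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return cov / (vx * vy) ** 0.5


def rank_features(X, y):
    """Return feature indices sorted by |correlation with y|, best first."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def select_top_k(X, y, k):
    """Keep only the k best-ranked features (hypothetical helper)."""
    keep = sorted(rank_features(X, y)[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Synthetic data (an assumption for illustration): feature 0 is the class
# label plus a little noise; features 1 to 4 are irrelevant random noise.
rng = random.Random(0)
y = [i % 2 for i in range(200)]
X = [[label + rng.gauss(0, 0.3)] + [rng.gauss(0, 1) for _ in range(4)]
     for label in y]

kept, X_reduced = select_top_k(X, y, k=1)
print("kept feature indices:", kept)
```

Any classifier trained on X_reduced then sees only the retained column, which is the sense in which a filter is applied before, and independently of, the choice of classifier.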
Keywords
Feature selection
Filters
Preprocessing
High dimensionality
Classification
Data analysis
Description
Funded for open-access publication: Universidade da Coruña/CISUG
Rights
Attribution-NonCommercial-NoDerivatives 4.0 International