Use of machine learning algorithms for network traffic classification

Costa Garrido, Anxo

dc.contributor.advisor	Fernández, Diego
dc.contributor.advisor	Novoa, Francisco
dc.contributor.author	Costa Garrido, Anxo
dc.contributor.other	Universidade da Coruña. Facultade de Informática	es_ES
dc.date.accessioned	2022-11-15T09:27:25Z
dc.date.available	2022-11-15T09:27:25Z
dc.date.issued	2022
dc.identifier.uri	http://hdl.handle.net/2183/32037
dc.description.abstract	[Abstract] Technological advances have made Internet access affordable, fast and reliable, increasing both the number of users and the services they demand. New design paradigms have emerged to simplify the management of increasingly complex networks, and they add new attack surfaces. COVID-19 restrictions forced organizations to build remote-work infrastructures overnight, giving greater weight to users and to the security of their environments. With more and more sensitive information exposed to the Internet, cybersecurity is more important than ever. Network traffic classification is a useful tool for security tasks, but analyzing every packet is computationally expensive, so the analysis is often performed at the network-flow level. In this work we applied several Machine Learning and Deep Learning models to a labeled dataset (InSDN) containing the flow-level traffic characteristics of a software-defined network, with the aim of performing supervised classification. The CRISP-DM methodology guided the whole process. The work involved intensive data engineering: we exhaustively analyzed the initial dataset, cleaned and formatted the data, constructed new features from existing ones, and selected the most informative ones. We then applied algorithms of different kinds to the resulting dataset after a hyperparameter tuning process, implementing them mainly with scikit-learn and Keras. Finally, we evaluated the resulting models using traditional classification metrics. The results show that Machine Learning and Deep Learning models are useful for network traffic classification problems, with Random Forest and LinearSVC standing out for their accuracy and speed.	es_ES
dc.language.iso	spa	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Machine learning	es_ES
dc.subject	Deep learning	es_ES
dc.subject	Intrusion detection	es_ES
dc.subject	Software defined networks	es_ES
dc.subject	Classification	es_ES
dc.subject	Random forest	es_ES
dc.subject	AdaBoost	es_ES
dc.subject	LinearSVC	es_ES
dc.subject	Autoencoder	es_ES
dc.subject	RNN-LSTM	es_ES
dc.subject	CNN	es_ES
dc.title	Uso de algoritmos de aprendizaje máquina para la clasificación de tráfico de red	es_ES
dc.type	info:eu-repo/semantics/bachelorThesis	es_ES
dc.rights.access	info:eu-repo/semantics/openAccess	es_ES
dc.description.traballos	Bachelor's thesis (UDC.FIC). Computer Engineering. Academic year 2021/2022	es_ES

Files in this item

Name: CostaGarrido_Anxo_TFG_2022.pdf
Size: 3.398 MB
Format: PDF

This item appears in the following collection(s)

Enxeñaría informática, Grao en [447]
