Use of machine learning algorithms for network traffic classification

Varela Álvarez, Christian Manuel

dc.contributor.advisor	Fernández Iglesias, Diego
dc.contributor.advisor	Nóvoa Manuel, Francisco Javier
dc.contributor.author	Varela Álvarez, Christian Manuel
dc.contributor.other	Enxeñaría informática, Grao en	es_ES
dc.date.accessioned	2021-01-13T17:51:53Z
dc.date.available	2021-01-13T17:51:53Z
dc.date.issued	2020-09
dc.identifier.uri	http://hdl.handle.net/2183/27116
dc.description.abstract	[Resumen] La cuarentena ocasionada por la pandemia mundial a causa del virus COVID-19 ha mostrado el camino que está tomando la sociedad en la que vivimos, una sociedad donde la filosofía del siempre conectado toma más fuerza que nunca. Muestra de esto ha sido el crecimiento del teletrabajo, provocando el aumento del tráfico de red a unas velocidades vertiginosas, y con ello el número de ataques y de nuevas amenazas en el mundo cibernético. Ante este escenario surge la necesidad de mejorar y actualizar los planes de defensa. Para poder analizar la gran cantidad de tráfico de red que se genera, aparece la estrategia de agregación de flujos, permitiendo agrupar el tráfico en una serie de paquetes que comparten unos valores concretos y así poder reducir la cantidad de datos a analizar mientras se conserva toda la información necesaria para dicha tarea. Aún así, esta agregación no consigue reducir lo suficiente las cantidades de datos, por lo que es aquí donde entra en juego el Big Data, que con el uso de robustas herramientas de Machine Learning junto con los sistemas distribuidos, permiten acometer la tarea de clasificación de tráfico de red de forma sencilla, eficiente y escalable. Este es el tema que aborda este proyecto, donde aplicamos minería de datos sobre un conjunto de flujos de red para que mediante la selección de tres algoritmos de clasificación de aprendizaje máquina poder crear tres modelos que son capaces de predecir si un flujo es tráfico normal o de ataque. Para esto, seguimos las fases marcadas por la metodología CRISP-DM. Finalmente, cada uno de estos modelos los desplegamos de forma distribuida para poder ver la importancia que tienen los sistemas distribuidos para el análisis en tiempo real del tráfico de una red.	es_ES
dc.description.abstract	[Abstract] The quarantine caused by the global COVID-19 pandemic has shown the direction in which our society is heading: one where the always-connected philosophy is stronger than ever. One sign of this has been the growth of teleworking, which has caused network traffic to increase at a dizzying rate, and with it the number of attacks and new threats in cyberspace. This scenario creates the need to improve and update defence plans. To analyse the large amount of network traffic generated, flow aggregation groups traffic into sets of packets that share specific values, reducing the amount of data to be analysed while preserving all the information needed for the task. Even so, aggregation does not reduce the volume of data enough, and this is where Big Data comes into play: robust Machine Learning tools combined with distributed systems make it possible to classify network traffic in a simple, efficient and scalable manner. This is the subject of this project, in which we apply data mining to a set of network flows and use three machine learning classification algorithms to build three models capable of predicting whether a flow is normal or attack traffic. To do so, we follow the phases defined by the CRISP-DM methodology. Finally, we deploy each of these models in a distributed way to show the importance of distributed systems for the real-time analysis of network traffic.	es_ES
dc.language.iso	spa	es_ES
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Spain	es_ES
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/es/	*
dc.subject	Flujo	es_ES
dc.subject	Anomalía	es_ES
dc.subject	Sistema distribuido	es_ES
dc.subject	Clasificación	es_ES
dc.subject	Regresión logística	es_ES
dc.subject	Machine learning	es_ES
dc.subject	Big Data	es_ES
dc.subject	Flow	es_ES
dc.subject	Anomaly	es_ES
dc.subject	Distributed system	es_ES
dc.subject	Classification	es_ES
dc.subject	Cluster	es_ES
dc.subject	Logistic regression	es_ES
dc.subject	Random forest	es_ES
dc.subject	Naive bayes	es_ES
dc.title	Uso de algoritmos de aprendizaje máquina para la clasificación de tráfico de red	es_ES
dc.type	info:eu-repo/semantics/bachelorThesis	es_ES
dc.rights.access	info:eu-repo/semantics/openAccess	es_ES
dc.description.traballos	Bachelor's thesis (UDC.FIC). Enxeñaría informática. Academic year 2019/2020	es_ES

Files in this item

Name: license_rdf
Size: 1.203Kb
Format: application/rdf+xml

Name: C.M.Varela_Álvarez_2020_Uso_de ...
Size: 4.604Mb
Format: PDF

This item appears in the following collection(s)

Enxeñaría informática, Grao en [447]
