Sistema de detección de anomalías basado en el análisis de flujos de red mediante técnicas de aprendizaje automático

Padín Torrente, Héctor

Use this link to cite:

http://hdl.handle.net/2183/31934

Files

PadinTorrente_Hector_TFG_2022.pdf (5.2 MB)

Identifiers

URI: http://hdl.handle.net/2183/31934

Publication date

2022

Authors

Padín Torrente, Héctor

Advisors

Nóvoa, Francisco

Dafonte, Carlos

Other responsibilities

Universidade da Coruña. Facultade de Informática

Type of academic work

TFG (bachelor's thesis)

Academic degree

Grao en Enxeñaría Informática

Abstract

[Abstract] Rapid advances in the Internet and communications have led to a huge increase in the number of connected devices and in the volume of traffic flowing over networks. In an increasingly "digitized" society, organizations are forced to rely more and more on IT services in general and on communications networks in particular. In addition, advances such as cloud computing have led organizations to replace traditional models focused on perimeter security with more open and exposed ones. While these advances have made it possible, among other things, to improve or maintain productivity in many sectors, they have also made information and system availability more valuable every day. As a result, each year brings a more substantial increase in both the number and the complexity of cyberattacks. One of the first lines of defense organizations use to detect possible intrusions in their networks is the NIDS (Network Intrusion Detection System). To detect attacks or unusual behavior, these systems traditionally inspected the contents of every packet. However, the large increase in network traffic, together with the growth and evolution of cyberattacks, has led to the introduction of more advanced strategies such as anomaly detection under the flow-based inspection paradigm. Anomaly detection consists of modeling the normal behavior of the network and detecting deviations from it. Machine learning (ML) and deep learning (DL) techniques are used to efficiently generate a baseline that reflects that normal behavior. When abnormal traffic patterns or irregular network activity are detected, these techniques alert the security team to the possible threat. This project presents different techniques for detecting anomalies in network flows. A labeled dataset, CSE-CIC-IDS-2018, was used and put through several exploration and preprocessing phases to facilitate learning. The techniques can be grouped according to NIDS inspection methodology: the anomaly-based ones, Isolation Forest and Autoencoder, and the signature-based ones, Random Forest and Deep Neural Network. Finally, both methodologies were combined to develop a hybrid classifier using the Autoencoder and the Random Forest. The results show that neither anomaly-based nor signature-based detection can realistically be deployed in an organization on its own. Their combination, however, exploits the advantages of both to detect the different cyberattacks effectively and efficiently.
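The baseline-and-deviation idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: a simple per-feature z-score stands in for the Isolation Forest and Autoencoder models, and the flow features (duration, packet count, byte count), the sample values, and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, stdev


def fit_baseline(benign_flows):
    """Learn a per-feature (mean, standard deviation) baseline from benign flows only."""
    columns = list(zip(*benign_flows))
    return [(mean(col), stdev(col)) for col in columns]


def anomaly_score(flow, baseline):
    """Return the largest absolute z-score across features; higher means more unusual."""
    scores = []
    for value, (mu, sigma) in zip(flow, baseline):
        if sigma > 0:
            scores.append(abs(value - mu) / sigma)
        else:
            # Constant feature in the baseline: any change at all is a deviation.
            scores.append(0.0 if value == mu else float("inf"))
    return max(scores)


def is_anomalous(flow, baseline, threshold=3.0):
    """Flag a flow whose deviation from the baseline exceeds the (assumed) threshold."""
    return anomaly_score(flow, baseline) > threshold


# Hypothetical benign flow records: (duration in seconds, packets, bytes).
benign = [
    (1.0, 10, 1200),
    (1.2, 12, 1500),
    (0.9, 9, 1100),
    (1.1, 11, 1300),
    (1.0, 10, 1250),
]
baseline = fit_baseline(benign)

typical_flow = (1.05, 10, 1240)
flood_like_flow = (1.0, 5000, 900000)
```

Under these assumptions, `is_anomalous(typical_flow, baseline)` stays below the threshold while `is_anomalous(flood_like_flow, baseline)` exceeds it. A detector of this kind only reports that a flow is unusual, not which attack it is, which is consistent with the abstract's conclusion that anomaly detection works better combined with a signature-based classifier.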

Rights

Attribution-NonCommercial 3.0 Spain

Collections

Traballos académicos (FIC)

