Use this link to cite:
http://hdl.handle.net/2183/31934
Sistema de detección de anomalías basado en el análisis de flujos de red mediante técnicas de aprendizaje automático
Identifiers
Publication date
Authors
Padín Torrente, Héctor
Advisors
Other responsibilities
Universidade da Coruña. Facultade de Informática
Journal Title
Bibliographic citation
Type of academic work
Academic degree
Abstract
[Resumen] Rapid advances in the Internet and communications have led to an enormous increase in the number of connected devices and in the volume of traffic carried by networks. As society becomes increasingly 'digitized', organizations are forced to rely more and more on IT services in general and on communication networks in particular. In addition, advances such as cloud computing have led organizations to replace traditional models centered on perimeter security with more open and exposed ones.
While these advances have made it possible, among other things, to improve or maintain productivity levels in many sectors, they have also made information and system availability more valuable every day. As a result, both the number and the complexity of cyberattacks grow more noticeably each year.
One of the first lines of defense organizations use to detect possible intrusions in their networks is the NIDS (Network Intrusion Detection System). To detect attacks or unusual behavior, these systems traditionally inspected the contents of every packet. However, the sharp growth in network traffic, together with the increase and evolution of cyberattacks, has led to the introduction of more advanced strategies such as anomaly detection using the flow-based inspection paradigm.
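As a minimal illustration of the flow-based paradigm, the sketch below groups packets by their 5-tuple (source address, destination address, source port, destination port, protocol) and keeps only per-flow counters instead of payloads. The `Packet` fields and the `aggregate_flows` helper are hypothetical names chosen for this example; they are not taken from the thesis or from any specific flow exporter.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified packet header record (illustration only)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    size: int         # bytes on the wire
    timestamp: float  # seconds


def aggregate_flows(packets):
    """Group packets by 5-tuple and summarise each group as one flow record."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "first": None, "last": None})
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        f = flows[key]
        f["packets"] += 1
        f["bytes"] += p.size
        f["first"] = p.timestamp if f["first"] is None else min(f["first"], p.timestamp)
        f["last"] = p.timestamp if f["last"] is None else max(f["last"], p.timestamp)
    # Derive a duration feature; packet payloads are never inspected.
    return {
        key: {"packets": f["packets"], "bytes": f["bytes"], "duration": f["last"] - f["first"]}
        for key, f in flows.items()
    }
```

This sketch treats each direction of a conversation as a separate flow; merging both directions into one record is a common alternative and would only change how the key is built.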
Anomaly detection consists of modeling the normal behavior of the network and detecting deviations from it. Machine learning (ML) and deep learning (DL) techniques are used to efficiently build a baseline that reflects that normal behavior. When abnormal traffic patterns or irregular network activity are detected, these techniques alert the security team to the possible threat.
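The idea of learning a baseline and flagging deviations can be sketched with a deliberately simple statistical stand-in. The example below is not one of the models used in the thesis (Isolation Forest and Autoencoder); it swaps in a per-feature z-score so the principle fits in a few lines. The function names, the feature layout, and the threshold of 3.0 are assumptions made for illustration.

```python
from statistics import mean, pstdev


def fit_baseline(normal_flows):
    """Learn per-feature mean and standard deviation from benign flows only."""
    columns = list(zip(*normal_flows))
    means = [mean(col) for col in columns]
    # Guard against zero variance so the score below never divides by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def anomaly_score(flow, baseline):
    """Largest absolute z-score across features: distance from the baseline."""
    means, stds = baseline
    return max(abs(x - m) / s for x, m, s in zip(flow, means, stds))


def is_anomalous(flow, baseline, threshold=3.0):
    """Flag a flow whose score exceeds the (assumed) threshold."""
    return anomaly_score(flow, baseline) > threshold
```

A flow here is just a tuple of numeric features, for example `(packets, bytes)`. The baseline is fitted on benign traffic only, so an attack never needs to be seen in advance to be flagged.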
This project presents several techniques for detecting anomalies in network flows. A labeled dataset, CSE-CIC-IDS-2018, was used, and it went through several exploration and preprocessing phases to facilitate learning. The techniques can be grouped by NIDS inspection methodology: the anomaly-based ones, Isolation Forest and Autoencoder, and the signature-based ones, Random Forest and Deep Neural Network. Finally, both methodologies were combined to develop a hybrid classifier using the Autoencoder and the Random Forest.
The results show that neither anomaly detection nor signature detection can realistically be deployed in an organization on its own. Combining them, however, exploits the advantages of both to detect the different cyberattacks effectively and efficiently.
[Abstract] Rapid advances in the Internet and communications have led to a huge increase in the number of connected devices and in the volume of traffic flowing over networks. With an increasingly 'digitized' society, organizations are forced to rely more and more on IT services in general and on communication networks in particular. In addition, advances such as cloud computing have led organizations to replace traditional models focused on perimeter security with more open and exposed models. While these advances have made it possible, among other things, to improve or maintain productivity levels in many sectors, they have also made information and system availability more valuable every day. This is why every year brings a more substantial increase in both the number and the complexity of cyberattacks. One of the first lines of defense for organizations to detect possible intrusions in their networks is the NIDS (Network Intrusion Detection System). In an attempt to detect attacks or unusual behavior, these systems traditionally inspected the contents of each packet. However, the large increase in traffic on communication networks, along with the increase and evolution of cyberattacks, has led to the introduction of more advanced strategies such as anomaly detection using the flow-based inspection paradigm. Anomaly detection consists of modeling normal network behavior and detecting deviations or unusual behavior within the network. Machine learning (ML) and deep learning (DL) techniques are used to efficiently generate a baseline that reflects the normal behavior of the network. Thus, when abnormal traffic patterns or irregular network activities are detected, these techniques alert the security team to the possible threat. In this project, different techniques for detecting anomalies in network flows are presented.
For this purpose, a labeled dataset, CSE-CIC-IDS-2018, was used, on which different exploration and preprocessing phases were performed to facilitate learning. The techniques used can be grouped according to the NIDS inspection methodology: the anomaly-based ones are Isolation Forest and Autoencoder, and the signature-based ones are Random Forest and Deep Neural Network. Finally, both methodologies were combined to develop a hybrid classifier using the Autoencoder and the Random Forest. From the results obtained, it can be concluded that neither anomaly-based nor signature-based techniques can realistically be deployed in an organization on their own. However, their combination takes advantage of the benefits of both to detect the different cyberattacks effectively and efficiently.
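The abstract does not say how the Autoencoder and the Random Forest are wired together, so the following is only one plausible arrangement, offered as a sketch under stated assumptions: a two-stage cascade in which an anomaly score filters flows first and a supervised classifier then names the attack. The `hybrid_classify` function, its parameters, and the `"unknown-anomaly"` label are hypothetical; the two callables are stand-ins for trained models, not real library APIs.

```python
from typing import Callable, Sequence

Flow = Sequence[float]


def hybrid_classify(
    flow: Flow,
    anomaly_score: Callable[[Flow], float],  # stand-in for Autoencoder reconstruction error
    classify_attack: Callable[[Flow], str],  # stand-in for a trained Random Forest
    threshold: float,
) -> str:
    """Two-stage decision (assumed cascade, not the thesis's documented design)."""
    # Stage 1: flows close to the learned baseline are accepted as benign.
    if anomaly_score(flow) <= threshold:
        return "benign"
    # Stage 2: the supervised model tries to name the attack.
    label = classify_attack(flow)
    # An anomaly the supervised model does not recognise is still reported.
    return label if label != "benign" else "unknown-anomaly"
```

In this arrangement the anomaly stage can surface attacks absent from the training labels, while the supervised stage attaches a known attack class when it can, which is one way to read the claim that the combination draws on the strengths of both approaches.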
Description
Keywords
IDS; NIDS; Anomaly; Machine learning; Deep learning; Flows; Exploratory data analysis; Feature engineering; Anomaly detection; Autoencoder; Signature detection; Random forest; Hybrid detection
Editor version
Rights
Attribution-NonCommercial 3.0 Spain







