Semantic Hierarchical Classification Applied to Anomaly Detection Using System Logs with a BERT Model
Use this link to cite
http://hdl.handle.net/2183/38332
Unless otherwise indicated, the item's license is described as Attribution 4.0 International (CC-BY 4.0)
Metadata
Title
Semantic Hierarchical Classification Applied to Anomaly Detection Using System Logs with a BERT Model
Date
2024-06
Bibliographic citation
Corbelle, C.; Carneiro, V.; Cacheda, F. Semantic Hierarchical Classification Applied to Anomaly Detection Using System Logs with a BERT Model. Appl. Sci. 2024, 14, 5388. https://doi.org/10.3390/app14135388
Abstract
[Abstract]: The compaction and structuring of system logs facilitate and expedite anomaly and cyberattack detection processes using machine-learning techniques, while simultaneously reducing alert fatigue caused by false positives. In this work, we implemented an innovative algorithm that employs hierarchical codes based on the semantics of natural language, enabling the generation of a significantly reduced log that preserves the semantics of the original. This method uses codes that reflect the specificity of the topic and its position within a higher hierarchical structure. By applying this catalog to the analysis of logs from the Hadoop Distributed File System (HDFS), we achieved a concise summary with non-repetitive themes, significantly speeding up log analysis and resulting in a substantial reduction in log size while maintaining high semantic similarity. The resulting log has been validated for anomaly detection using the “bert-base-uncased” model and compared with six other methods: PCA, IM, LogCluster, SVM, DeepLog, and LogRobust. The reduced log achieved very similar values in precision, recall, and F1-score metrics, but drastically reduced processing time.
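As a rough illustration of the idea in the abstract, the sketch below maps each log line to a hierarchical code and collapses consecutive lines that share a code, yielding a shorter log with non-repetitive themes. It is a minimal sketch under stated assumptions, not the authors' algorithm: the catalog entries, the code values, and the helper names `code_for`, `compact`, and `parent` are hypothetical, and plain keyword lookup stands in for the semantic matching the paper describes.

```python
from itertools import groupby

# Hypothetical catalog: keyword -> hierarchical code ("parent.child").
# Keywords and codes are illustrative only; they are not the paper's catalog.
CATALOG = {
    "receiving block": "1.1",
    "received block": "1.2",
    "deleting block": "2.1",
    "exception": "9.1",
}


def code_for(line: str) -> str:
    """Return the code of the first catalog keyword found in the line, or "0"."""
    lowered = line.lower()
    for keyword, code in CATALOG.items():
        if keyword in lowered:
            return code
    return "0"  # uncatalogued topic


def compact(lines):
    """Collapse consecutive lines with the same code into (code, count) pairs."""
    codes = (code_for(line) for line in lines)
    return [(code, sum(1 for _ in group)) for code, group in groupby(codes)]


def parent(code: str) -> str:
    """Return the top-level topic of a hierarchical code, e.g. "1.2" -> "1"."""
    return code.split(".")[0]


if __name__ == "__main__":
    sample = [
        "Receiving block blk_1 src: /10.0.0.1",
        "Receiving block blk_2 src: /10.0.0.2",
        "Received block blk_1 of size 67108864",
        "Deleting block blk_1 file /data",
    ]
    for code, count in compact(sample):
        print(code, parent(code), count)
```

Because each code carries its parent as a prefix, the compacted log can be analyzed at the specific topic level or at the coarser parent level without revisiting the raw lines.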
Keywords
System logs
Anomaly detection
BERT model
Hierarchical codes
Semantic similarity
Rights
Attribution 4.0 International (CC-BY 4.0)
ISSN
2076-3417