Semantic Hierarchical Classification Applied to Anomaly Detection Using System Logs with a BERT Model
Use this link to cite: http://hdl.handle.net/2183/38332
Bibliographic citation
Corbelle, C.; Carneiro, V.; Cacheda, F. Semantic Hierarchical Classification Applied to Anomaly Detection Using System Logs with a BERT Model. Appl. Sci. 2024, 14, 5388. https://doi.org/10.3390/app14135388
Abstract
The compaction and structuring of system logs facilitate and expedite anomaly and cyberattack detection with machine-learning techniques, while also reducing the alert fatigue caused by false positives. In this work, we implemented an innovative algorithm that employs hierarchical codes based on the semantics of natural language, enabling the generation of a significantly reduced log that preserves the semantics of the original. The method uses codes that reflect both the specificity of a topic and its position within a higher hierarchical structure. By applying this catalog to logs from the Hadoop Distributed File System (HDFS), we obtained a concise summary with non-repetitive themes, which significantly sped up log analysis and substantially reduced log size while maintaining high semantic similarity to the original. The resulting log was validated for anomaly detection using the “bert-base-uncased” model and compared with six other methods: PCA, IM, LogCluster, SVM, DeepLog, and LogRobust. The reduced log achieved very similar precision, recall, and F1-score values while drastically reducing processing time.
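The idea of replacing log lines with hierarchical topic codes and dropping repeated themes can be illustrated with a minimal sketch. This is not the authors' algorithm: the catalog, its keywords, the dotted code format, and the sample log lines are all hypothetical, and simple keyword matching stands in for the semantic classification the abstract describes.

```python
from itertools import groupby

# Hypothetical catalog (an assumption, not the paper's): keyword -> dotted code.
# The part before the last dot names the broader topic; the full code names
# the specific topic within it.
CATALOG = {
    "receiving block": "1.1",
    "deleting block": "1.2",
    "exception": "2.1",
}


def encode(line: str) -> str:
    """Return the code of the first catalog keyword found in the line."""
    lowered = line.lower()
    for keyword, code in CATALOG.items():
        if keyword in lowered:
            return code
    return "0"  # uncategorized


def parent(code: str) -> str:
    """Return the broader topic a dotted code belongs to."""
    return code.rsplit(".", 1)[0]


def compact(lines):
    """Collapse runs of consecutive lines sharing a code into (code, count)."""
    codes = [encode(line) for line in lines]
    return [(code, len(list(run))) for code, run in groupby(codes)]


# Made-up lines in the general style of HDFS logs.
sample = [
    "Receiving block blk_1 src: /10.0.0.1:50010",
    "Receiving block blk_2 src: /10.0.0.1:50010",
    "Deleting block blk_1",
    "Exception in receiveBlock for block blk_3",
]
reduced = compact(sample)
```

Here the four sample lines shrink to three coded entries, and `parent` shows how a specific code can be rolled up to its broader topic for a coarser summary.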
Rights
Attribution 4.0 International (CC BY 4.0)