• Title/Summary/Keyword: data anomaly detection

Comparison of Anomaly Detection Performance Based on GRU Model Applying Various Data Preprocessing Techniques and Data Oversampling (다양한 데이터 전처리 기법과 데이터 오버샘플링을 적용한 GRU 모델 기반 이상 탐지 성능 비교)

  • Yoo, Seung-Tae;Kim, Kangseok
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.201-211 / 2022
  • With the recent shift in the cybersecurity paradigm, research on anomaly detection using machine learning and deep learning, the technologies used to implement AI, is increasing. This study compares data preprocessing techniques that can improve the anomaly detection performance of a GRU (Gated Recurrent Unit) neural network-based intrusion detection model, using the open dataset NGIDS-DS (Next Generation IDS Dataset). In addition, to address the class imbalance between normal data and attack data, detection performance was compared and analyzed across oversampling ratios, with oversampling performed by a DCGAN (Deep Convolutional Generative Adversarial Network). In the experiments, preprocessing the system call and process execution path features with the Doc2Vec algorithm showed good performance, and oversampling with DCGAN improved detection performance.

AN ANOMALY DETECTION METHOD BY ASSOCIATIVE CLASSIFICATION

  • Lee, Bum-Ju;Lee, Heon-Gyu;Ryu, Keun-Ho
    • Proceedings of the KSRS Conference / 2005.10a / pp.301-304 / 2005
  • To detect intrusions based on anomalies in a user's activities, previous work has concentrated on statistical techniques or frequent episode mining to analyze audit data. However, because these approaches mainly analyze the average behaviour of a user's activities, some anomalies can be detected inaccurately. We therefore propose an anomaly detection method that uses associative classification to model intrusion detection. Finally, we show experimentally that a prediction model built with the associative classification method yields better accuracy than one built with traditional methods.

Effective Dimensionality Reduction of Payload-Based Anomaly Detection in TMAD Model for HTTP Payload

  • Kakavand, Mohsen;Mustapha, Norwati;Mustapha, Aida;Abdullah, Mohd Taufik
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.8 / pp.3884-3910 / 2016
  • An Intrusion Detection System (IDS) generally handles a large amount of data that is highly redundant and irrelevant. This trait causes slow training and assessment procedures, high resource consumption, and a poor detection rate. Because of their expensive computational requirements during both training and detection, IDSs are mostly ineffective for real-time anomaly detection. This paper proposes a dimensionality reduction technique based on Principal Component Analysis (PCA) that is able to enhance the performance of IDSs up to constant time O(1). Furthermore, the study offers a feature selection approach for identifying major components in real time. The PCA algorithm transforms high-dimensional feature vectors into a low-dimensional feature space, which is used to determine the optimal number of factors. The proposed approach was assessed using the HTTP packet payloads of the ISCX 2012 IDS and DARPA 1999 datasets. The experiments showed that the proposed anomaly detection achieved a 97% detection rate with a 1.2% false positive rate on the ISCX 2012 dataset, and a 100% detection rate with a 0.06% false positive rate on the DARPA 1999 dataset. It also achieved comparable computational complexity when compared to three state-of-the-art anomaly detection systems.
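
The PCA projection mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' TMAD implementation: it uses plain-Python power iteration to find only the first principal component, and the function names and any input data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def covariance_matrix(
    rows: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Return the column means and the sample covariance matrix of `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return means, cov


def first_principal_component(
    cov: Sequence[Sequence[float]], iterations: int = 200
) -> List[float]:
    """Approximate the dominant eigenvector of `cov` by power iteration.

    Assumption: the uniform start vector is not orthogonal to the dominant
    eigenvector; a production implementation would use a library routine.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all-zero covariance: nothing to project onto
            break
        v = [x / norm for x in w]
    return v


def project(
    rows: Sequence[Sequence[float]],
    means: Sequence[float],
    component: Sequence[float],
) -> List[float]:
    """Map each centered row to its 1-D coordinate along `component`."""
    return [
        sum((r[j] - means[j]) * component[j] for j in range(len(means)))
        for r in rows
    ]
```

Each high-dimensional feature vector is reduced here to a single score along the direction of greatest variance; keeping more components would require deflation or a full eigendecomposition.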

The Bayesian Framework based on Graphics for the Behavior Profiling (행위 프로파일링을 위한 그래픽 기반의 베이지안 프레임워크)

  • 차병래
    • Journal of the Korea Institute of Information Security & Cryptology / v.14 no.5 / pp.69-78 / 2004
  • The rapid expansion of the Internet and the appearance of new forms of attack have changed the paradigm of attack techniques. However, most intrusion detection systems rely on misuse detection and therefore detect only known attack types, which makes it difficult to respond actively to new attacks. To raise the detection rate for new attack patterns, experiments applying various anomaly detection techniques are therefore appearing. In this paper, we propose a behavior profiling method that uses a graphics-based Bayesian framework built from audit data, and we visualize the behavior profile to detect and analyze anomalous behavior. We carry out a simulation that translates host/network audit data into BF-XML, a behavior profile in a semi-structured data format for anomaly detection, and visualizes the BF-XML as SVG.

Online anomaly detection algorithm based on deep support vector data description using incremental centroid update (점진적 중심 갱신을 이용한 deep support vector data description 기반의 온라인 비정상 탐지 알고리즘)

  • Lee, Kibae;Ko, Guhn Hyeok;Lee, Chong Hyun
    • The Journal of the Acoustical Society of Korea / v.41 no.2 / pp.199-209 / 2022
  • Typical anomaly detection algorithms are trained on prior data. Such batch learning-based algorithms therefore suffer inevitable performance degradation when the characteristics of newly incoming normal data change over time. We propose an online anomaly detection algorithm that accounts for gradual changes in the characteristics of incoming normal data. The proposed algorithm, based on a one-class classification model, includes both offline and online learning procedures. In the offline learning procedure, the algorithm learns to map the prior data close to the centroid of the latent space, and then updates that centroid incrementally with new incoming data. In the online learning procedure, the algorithm continues learning using the updated centroid. In experiments on public underwater acoustic data, the proposed online algorithm requires only approximately 2 % additional learning time for the incremental centroid update and learning. Nevertheless, it shows a 19.10 % improvement in Area Under the receiver operating characteristic Curve (AUC) compared to the offline learning model when new normal data arrives.
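
The idea of an incrementally updated centroid can be sketched as follows. The abstract does not give the update rule, so this example assumes a simple running mean over latent vectors and a squared-distance anomaly score in the style of one-class classification; the class and method names are hypothetical, and the neural network that would produce the latent vectors is omitted.

```python
from typing import List, Sequence


class RunningCentroid:
    """Centroid of latent vectors that is updated one sample at a time."""

    def __init__(self, initial_points: Sequence[Sequence[float]]) -> None:
        # Offline step: start from the mean of the prior (training) data.
        self.count = len(initial_points)
        dim = len(initial_points[0])
        self.centroid: List[float] = [
            sum(p[i] for p in initial_points) / self.count for i in range(dim)
        ]

    def update(self, point: Sequence[float]) -> List[float]:
        """Online step: fold one new normal sample into the running mean."""
        self.count += 1
        self.centroid = [
            c + (x - c) / self.count for c, x in zip(self.centroid, point)
        ]
        return self.centroid

    def score(self, point: Sequence[float]) -> float:
        """Squared Euclidean distance to the centroid (higher = more anomalous)."""
        return sum((x - c) ** 2 for x, c in zip(point, self.centroid))
```

The running-mean form avoids storing past samples, which is one plausible reason an incremental update adds little learning time; an exponentially weighted update would be an alternative when older data should be forgotten faster.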

Techniques for Improving Host-based Anomaly Detection Performance using Attack Event Types and Occurrence Frequencies

  • Juyeon Lee;Daeseon Choi;Seung-Hyun Kim
    • Journal of the Korea Society of Computer and Information / v.28 no.11 / pp.89-101 / 2023
  • To prevent damage caused by cyber-attacks on nations, businesses, and other entities, anomaly detection techniques for the early detection of attackers have been researched consistently. Reducing detection time and false positives is essential for promptly preventing external or internal intrusion attacks. In this study, we hypothesized that the type and frequency of attack events would help improve the true positive rate and reduce the false positive rate of anomaly detection. To validate this hypothesis, we used the 2015 login log dataset from the Los Alamos National Laboratory. Applying the preprocessed data to representative anomaly detection algorithms, we confirmed that using features that simultaneously consider the type and frequency of attack events is highly effective in reducing false positives and execution time for anomaly detection.

Anomaly Detection in Medical Wireless Sensor Networks

  • Salem, Osman;Liu, Yaning;Mehaoua, Ahmed
    • Journal of Computing Science and Engineering / v.7 no.4 / pp.272-284 / 2013
  • In this paper, we propose a new framework for anomaly detection in medical wireless sensor networks, which are used for remote monitoring of patient vital signs. The proposed framework performs sequential data analysis on a mini gateway used as a base station to detect abnormal changes and to cope with unreliable measurements in collected data without prior knowledge of anomalous events or normal data patterns. The proposed approach is based on the Mahalanobis distance for spatial analysis, and a kernel density estimator for the identification of abnormal temporal patterns. Our main objective is to distinguish between faulty measurements and clinical emergencies in order to reduce false alarms triggered by faulty measurements or ill-behaved sensors. Our experimental results on both real and synthetic medical datasets show that the proposed approach can achieve good detection accuracy with a low false alarm rate (less than 5.5%).
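
The Mahalanobis distance named in this abstract can be shown with a minimal two-feature sketch. It is not the authors' framework: restricting to two features allows a closed-form 2x2 covariance inverse in plain Python, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Cov2 = Tuple[float, float, float]  # (var_x, cov_xy, var_y)


def mean_and_cov_2d(points: Sequence[Point]) -> Tuple[Point, Cov2]:
    """Sample mean and sample covariance of two-feature observations."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_2d(point: Point, mean: Point, cov: Cov2) -> float:
    """Distance of `point` from `mean`, scaled by the covariance structure."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix is singular; cannot invert")
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # The inverse of [[sxx, sxy], [sxy, syy]] is (1/det) * [[syy, -sxy], [-sxy, sxx]].
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)
```

Unlike a plain Euclidean distance, this measure accounts for the correlation between the two features, so a reading that breaks the usual relationship between them scores as more unusual than one that merely has a large magnitude.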

MLOps workflow language and platform for time series data anomaly detection

  • Sohn, Jung-Mo;Kim, Su-Min
    • Journal of the Korea Society of Computer and Information / v.27 no.11 / pp.19-27 / 2022
  • In this study, we propose a language and platform to describe and manage the MLOps (Machine Learning Operations) workflow for time series data anomaly detection. Time series data is collected in many fields, such as IoT sensors, system performance indicators, and user access, and it is used in many applications such as system monitoring and anomaly detection. To perform prediction and anomaly detection on time series data, an MLOps platform that can quickly and flexibly apply the analyzed model to the production environment is required. We therefore developed the Python-based AI/ML Modeling Language (AMML) to easily configure and execute MLOps workflows; Python was chosen because it is widely used in data analysis. Using AMML, the proposed MLOps platform can extract and preprocess time series data from various data sources (R-DB, NoSQL DB, log files, etc.) and run predictions on it with a deep learning model. To verify the applicability of AMML, a workflow for generating a deep learning model that predicts transformer oil temperature was configured with AMML, and it was confirmed that training completed normally.

Comparison and Analysis of Anomaly Detection Methods for Detecting Data Exfiltration (데이터 유출 탐지를 위한 이상 행위 탐지 방법의 비교 및 분석)

  • Lim, Wongi;Kwon, Koohyung;Kim, Jung-Jae;Lee, Jong-Eon;Cha, Si-Ho
    • Journal of the Korea Academia-Industrial cooperation Society / v.17 no.9 / pp.440-446 / 2016
  • Military secrets and the confidential data of any organization are extremely important assets that must not be disclosed to outsiders. To this end, methods for detecting anomalous attacks and intrusions inside the network have been proposed. However, most anomaly detection methods cover only intrusion from outside and do not deal with internal leakage of data, which inflicts greater damage than outside intrusions and attacks. In addition, applying conventional anomaly detection methods to data exfiltration creates many problems, because the methods do not consider a number of variables or the internal network environment. In this paper, we describe the issues to consider in data exfiltration detection for anomaly detection (DEDfAD) in order to improve the accuracy of the methods, classify the methods as profile-based detection or machine learning-based detection, and analyze their advantages and disadvantages. We also suggest future research challenges through a comparative analysis of these issues against the classification of detection methods.

Sequence Anomaly Detection based on Diffusion Model (확산 모델 기반 시퀀스 이상 탐지)

  • Zhiyuan Zhang;Inwhee, Joe
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.2-4 / 2023
  • Sequence data plays an important role in the field of intelligence, especially for industrial control, traffic control and other aspects. Finding abnormal parts in sequence data has long been an application field of AI technology. In this paper, we propose an anomaly detection method for sequence data using a diffusion model. The diffusion model has two major advantages: interpretability derived from rigorous mathematical derivation and unrestricted selection of backbone models. This method uses the diffusion model to predict and reconstruct the sequence data, and then detects the abnormal part by comparing with the real data. This paper successfully verifies the feasibility of the diffusion model in the field of anomaly detection. We use the combination of MLP and diffusion model to generate data and compare the generated data with real data to detect anomalous points.
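
The final comparison step described in this abstract, checking generated data against real data, can be sketched as a pointwise error test. The abstract does not state how the comparison is scored, so the absolute-error criterion, the fixed threshold, and the function name here are assumptions; the diffusion model that would produce the reconstruction is omitted.

```python
from typing import List, Sequence


def flag_anomalies(
    real: Sequence[float],
    reconstructed: Sequence[float],
    threshold: float,
) -> List[int]:
    """Return the indices where the reconstruction error exceeds `threshold`.

    `reconstructed` stands in for the output of a generative model;
    `threshold` is a hypothetical tuning parameter.
    """
    if len(real) != len(reconstructed):
        raise ValueError("sequences must have the same length")
    return [
        i
        for i, (r, g) in enumerate(zip(real, reconstructed))
        if abs(r - g) > threshold
    ]
```

A model trained on normal sequences tends to reconstruct normal points well, so large errors mark candidate anomalies; in practice the threshold would be chosen from validation data.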