• Title/Summary/Keyword: UNSW-NB15

Search Result 12, Processing Time 0.021 seconds

Classification Performance Improvement of UNSW-NB15 Dataset Based on Feature Selection (특징선택 기법에 기반한 UNSW-NB15 데이터셋의 분류 성능 개선)

  • Lee, Dae-Bum;Seo, Jae-Hyun
    • Journal of the Korea Convergence Society
    • /
    • v.10 no.5
    • /
    • pp.35-42
    • /
    • 2019
  • Recently, as the Internet and various wearable devices have appeared, Internet technology has contributed to obtaining more convenient information and doing business. However, as the internet is used in various parts, the attack surface points that are exposed to attacks are increasing, Attempts to invade networks aimed at taking unfair advantage, such as cyber terrorism, are also increasing. In this paper, we propose a feature selection method to improve the classification performance of the class to classify the abnormal behavior in the network traffic. The UNSW-NB15 dataset has a rare class imbalance problem with relatively few instances compared to other classes, and an undersampling method is used to eliminate it. We use the SVM, k-NN, and decision tree algorithms and extract a subset of combinations with superior detection accuracy and RMSE through training and verification. The subset has recall values of more than 98% through the wrapper based experiments and the DT_PSO showed the best performance.

Malicious Traffic Classification Using Mitre ATT&CK and Machine Learning Based on UNSW-NB15 Dataset (마이터 어택과 머신러닝을 이용한 UNSW-NB15 데이터셋 기반 유해 트래픽 분류)

  • Yoon, Dong Hyun;Koo, Ja Hwan;Won, Dong Ho
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.12 no.2
    • /
    • pp.99-110
    • /
    • 2023
  • This study proposed a classification of malicious network traffic using the cyber threat framework(Mitre ATT&CK) and machine learning to solve the real-time traffic detection problems faced by current security monitoring systems. We applied a network traffic dataset called UNSW-NB15 to the Mitre ATT&CK framework to transform the label and generate the final dataset through rare class processing. After learning several boosting-based ensemble models using the generated final dataset, we demonstrated how these ensemble models classify network traffic using various performance metrics. Based on the F-1 score, we showed that XGBoost with no rare class processing is the best in the multi-class traffic environment. We recognized that machine learning ensemble models through Mitre ATT&CK label conversion and oversampling processing have differences over existing studies, but have limitations due to (1) the inability to match perfectly when converting between existing datasets and Mitre ATT&CK labels and (2) the presence of excessive sparse classes. Nevertheless, Catboost with B-SMOTE achieved the classification accuracy of 0.9526, which is expected to be able to automatically detect normal/abnormal network traffic.

Intrusion Detection System based on Packet Payload Analysis using Transformer

  • Woo-Seung Park;Gun-Nam Kim;Soo-Jin Lee
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.11
    • /
    • pp.81-87
    • /
    • 2023
  • Intrusion detection systems that learn metadata of network packets have been proposed recently. However these approaches require time to analyze packets to generate metadata for model learning, and time to pre-process metadata before learning. In addition, models that have learned specific metadata cannot detect intrusion by using original packets flowing into the network as they are. To address the problem, this paper propose a natural language processing-based intrusion detection system that detects intrusions by learning the packet payload as a single sentence without an additional conversion process. To verify the performance of our approach, we utilized the UNSW-NB15 and Transformer models. First, the PCAP files of the dataset were labeled, and then two Transformer (BERT, DistilBERT) models were trained directly in the form of sentences to analyze the detection performance. The experimental results showed that the binary classification accuracy was 99.03% and 99.05%, respectively, which is similar or superior to the detection performance of the techniques proposed in previous studies. Multi-class classification showed better performance with 86.63% and 86.36%, respectively.

Learning Memory-Guided Normality with Only Normal Training Data for Novelty Detection in Network Data (네트워크 이상치 탐지를 위한 정상 데이터만을 활용한 메모리 기반 정상성 학습)

  • Lee, Geonsu;Lee, Hochang;Sim, Jaehoon;Koo, Hyung Il;Cho, Nam Ik
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.11a
    • /
    • pp.83-86
    • /
    • 2020
  • 본 논문에서는 네트워크 이상치 탐지를 위하여 정상 데이터만을 활용한 메모리 기반 정상성 학습 모델을 제안한다. 오토인코더를 기반으로 정상 데이터의 특징을 표현하는 프로토타입을 생성할 수 있도록 신경망을 구성하고, 네트워크 데이터의 특성을 반영하여 쿼리의 수를 한 개로 고정하며, 사용되는 프로토타입의 수를 지정한 값으로 고정하여 모든 프로토타입에 정상 데이터의 특징을 반영할 수 있는 학습 방법을 제안한다. 해당 모델을 네트워크 이상치 탐지 데이터 세트인 Kyoto Honeypot, UNSW-NB15, CICIDS-2018에 적용하여 본 결과 Kyoto Honeypot에서는 0.821, UNSW-NB15에서는 0.854, CICIDS-2018에서는 0.981의 AUROC를 달성했다.

  • PDF

Machine Learning Based Hybrid Approach to Detect Intrusion in Cyber Communication

  • Neha Pathak;Bobby Sharma
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.11
    • /
    • pp.190-194
    • /
    • 2023
  • By looking the importance of communication, data delivery and access in various sectors including governmental, business and individual for any kind of data, it becomes mandatory to identify faults and flaws during cyber communication. To protect personal, governmental and business data from being misused from numerous advanced attacks, there is the need of cyber security. The information security provides massive protection to both the host machine as well as network. The learning methods are used for analyzing as well as preventing various attacks. Machine learning is one of the branch of Artificial Intelligence that plays a potential learning techniques to detect the cyber-attacks. In the proposed methodology, the Decision Tree (DT) which is also a kind of supervised learning model, is combined with the different cross-validation method to determine the accuracy and the execution time to identify the cyber-attacks from a very recent dataset of different network attack activities of network traffic in the UNSW-NB15 dataset. It is a hybrid method in which different types of attributes including Gini Index and Entropy of DT model has been implemented separately to identify the most accurate procedure to detect intrusion with respect to the execution time. The different DT methodologies including DT using Gini Index, DT using train-split method and DT using information entropy along with their respective subdivision such as using K-Fold validation, using Stratified K-Fold validation are implemented.

Traffic Data Generation Technique for Improving Network Attack Detection Using Deep Learning (네트워크 공격 탐지 성능향상을 위한 딥러닝을 이용한 트래픽 데이터 생성 연구)

  • Lee, Wooho;Hahm, Jaegyoon;Jung, Hyun Mi;Jeong, Kimoon
    • Journal of the Korea Convergence Society
    • /
    • v.10 no.11
    • /
    • pp.1-7
    • /
    • 2019
  • Recently, various approaches to detect network attacks using machine learning have been studied and are being applied to detect new attacks and to increase precision. However, the machine learning method is dependent on feature extraction and takes a long time and complexity. It also has limitation of performace due to learning data imbalance. In this study, we propose a method to solve the degradation of classification performance due to imbalance of learning data among the limit points of detection system. To do this, we generate data using Generative Adversarial Networks (GANs) and propose a classification method using Convolutional Neural Networks (CNNs). Through this approach, we can confirm that the accuracy is improved when applied to the NSL-KDD and UNSW-NB15 datasets.

Improving prediction performance of network traffic using dense sampling technique (밀집 샘플링 기법을 이용한 네트워크 트래픽 예측 성능 향상)

  • Jin-Seon Lee;Il-Seok Oh
    • Smart Media Journal
    • /
    • v.13 no.6
    • /
    • pp.24-34
    • /
    • 2024
  • If the future can be predicted from network traffic data, which is a time series, it can achieve effects such as efficient resource allocation, prevention of malicious attacks, and energy saving. Many models based on statistical and deep learning techniques have been proposed, and most of these studies have focused on improving model structures and learning algorithms. Another approach to improving the prediction performance of the model is to obtain a good-quality data. With the aim of obtaining a good-quality data, this paper applies a dense sampling technique that augments time series data to the application of network traffic prediction and analyzes the performance improvement. As a dataset, UNSW-NB15, which is widely used for network traffic analysis, is used. Performance is analyzed using RMSE, MAE, and MAPE. To increase the objectivity of performance measurement, experiment is performed independently 10 times and the performance of existing sparse sampling and dense sampling is compared as a box plot. As a result of comparing the performance by changing the window size and the horizon factor, dense sampling consistently showed a better performance.

Enhancing cloud computing security: A hybrid machine learning approach for detecting malicious nano-structures behavior

  • Xu Guo;T.T. Murmy
    • Advances in nano research
    • /
    • v.15 no.6
    • /
    • pp.513-520
    • /
    • 2023
  • The exponential proliferation of cutting-edge computing technologies has spurred organizations to outsource their data and computational needs. In the realm of cloud-based computing environments, ensuring robust security, encompassing principles such as confidentiality, availability, and integrity, stands as an overarching imperative. Elevating security measures beyond conventional strategies hinges on a profound comprehension of malware's multifaceted behavioral landscape. This paper presents an innovative paradigm aimed at empowering cloud service providers to adeptly model user behaviors. Our approach harnesses the power of a Particle Swarm Optimization-based Probabilistic Neural Network (PSO-PNN) for detection and recognition processes. Within the initial recognition module, user behaviors are translated into a comprehensible format, and the identification of malicious nano-structures behaviors is orchestrated through a multi-layer neural network. Leveraging the UNSW-NB15 dataset, we meticulously validate our approach, effectively characterizing diverse manifestations of malicious nano-structures behaviors exhibited by users. The experimental results unequivocally underscore the promise of our method in fortifying security monitoring and the discernment of malicious nano-structures behaviors.

Network intrusion detection method based on matrix factorization of their time and frequency representations

  • Chountasis, Spiros;Pappas, Dimitrios;Sklavounos, Dimitris
    • ETRI Journal
    • /
    • v.43 no.1
    • /
    • pp.152-162
    • /
    • 2021
  • In the last few years, detection has become a powerful methodology for network protection and security. This paper presents a new detection scheme for data recorded over a computer network. This approach is applicable to the broad scientific field of information security, including intrusion detection and prevention. The proposed method employs bidimensional (time-frequency) data representations of the forms of the short-time Fourier transform, as well as the Wigner distribution. Moreover, the method applies matrix factorization using singular value decomposition and principal component analysis of the two-dimensional data representation matrices to detect intrusions. The current scheme was evaluated using numerous tests on network activities, which were recorded and presented in the KDD-NSL and UNSW-NB15 datasets. The efficiency and robustness of the technique have been experimentally proved.

Experimental Comparison of Network Intrusion Detection Models Solving Imbalanced Data Problem (데이터의 불균형성을 제거한 네트워크 침입 탐지 모델 비교 분석)

  • Lee, Jong-Hwa;Bang, Jiwon;Kim, Jong-Wouk;Choi, Mi-Jung
    • KNOM Review
    • /
    • v.23 no.2
    • /
    • pp.18-28
    • /
    • 2020
  • With the development of the virtual community, the benefits that IT technology provides to people in fields such as healthcare, industry, communication, and culture are increasing, and the quality of life is also improving. Accordingly, there are various malicious attacks targeting the developed network environment. Firewalls and intrusion detection systems exist to detect these attacks in advance, but there is a limit to detecting malicious attacks that are evolving day by day. In order to solve this problem, intrusion detection research using machine learning is being actively conducted, but false positives and false negatives are occurring due to imbalance of the learning dataset. In this paper, a Random Oversampling method is used to solve the unbalance problem of the UNSW-NB15 dataset used for network intrusion detection. And through experiments, we compared and analyzed the accuracy, precision, recall, F1-score, training and prediction time, and hardware resource consumption of the models. Based on this study using the Random Oversampling method, we develop a more efficient network intrusion detection model study using other methods and high-performance models that can solve the unbalanced data problem.