Extraction and classification of characteristic information of malicious code for an intelligent detection model

지능적 탐지 모델을 위한 악의적인 코드의 특징 정보 추출 및 분류

  • Hwang, Yoon-Cheol (Department of Talmage Liberal Arts.Convergence College, Hannam University)
  • 황윤철 (한남대학교 탈메이지 교양.융합대학)
  • Received : 2022.03.11
  • Accepted : 2022.05.20
  • Published : 2022.05.28

Abstract

In recent years, malicious code has been produced using advancing information and communication technology, and existing detection systems are insufficient to detect it. Detecting and responding to such intelligent malicious code accurately and efficiently requires an intelligent detection model, and to maximize detection performance it is important to train the model on a set of the main characteristic information of the malicious code. In this paper, we design an intelligent detection model and propose a technique that generates the data required for model training as a set of key feature information through transformation, dimensionality reduction, and feature selection steps. Based on this, the main characteristic information is classified by type of malicious code. In addition, from the classified characteristic information, we derive common characteristic information that can be used to analyze and detect modified or newly emerging malicious code. Because the proposed detection model learns from a limited number of characteristics, detection and response are fast, so damage can be greatly reduced. Although the performance evaluation results differ slightly depending on the learning algorithm, the evaluation showed that most malicious code can be detected.

최근에는 발전하는 정보통신 기술을 이용하여 악의적인 코드들이 제작되고 있고 이를 기존 탐지 시스템으로는 탐지하는게 역부족인 실정이다. 이러한 지능적이고 악의적인 코드를 정확하고 효율성 있게 탐지하고 대응하기 위해서는 지능적 탐지 모델이 필요하다. 그리고, 탐지 성능을 최대로 높이기 위해서는 악의적인 코드의 주요 특징 정보 집합으로 훈련하는 것이 중요하다. 본 논문에서는 지능적 탐지 모델을 설계하고 모델 훈련에 필요한 데이터를 변환, 차원축소, 특징 선택 단계를 거쳐 주요 특징 정보 집합으로 생성하는 기법을 제안하였다. 그리고 이를 기반으로 악의적인 코드별로 주요 특징 정보를 분류하였다. 또한, 분류된 특징 정보들을 기반으로 변형되거나 새로 등장하는 악의적인 코드를 분석하고 탐지하는데 사용할 수 있는 공통 특징 정보를 도출하였다. 제안된 탐지 모델은 제한된 수의 특성 정보로 학습하여 악의적인 코드를 탐지하기에 탐지 시간과 대응이 빨리 이루어져 피해를 크게 줄일 수 있다. 그리고, 성능 평가 결과값은 학습 알고리즘에 따라 약간 차이가 나지만 악의적인 코드 대부분을 탐지할 수 있음을 평가로 알 수 있었다.
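The three preprocessing steps named in the abstract (transformation, dimensionality reduction, feature selection) and the derivation of common characteristic information can be sketched as follows. This is a minimal illustration and not the paper's actual method: the abstract does not state which algorithms are used, so the low-variance filter and the mean-difference score are simple stand-ins chosen here, and the sample records, feature names, class labels, and thresholds are all hypothetical.

```python
from statistics import mean, pvariance

# Hypothetical toy records; names, values, and labels are illustrative only.
NAMES = ["protocol", "duration", "bytes_sent", "flag"]
SAMPLES = [
    ("tcp", 2.0, 500.0, 1.0),
    ("udp", 0.1, 40.0, 1.0),
    ("tcp", 3.0, 650.0, 1.0),
    ("udp", 0.2, 35.0, 1.0),
    ("tcp", 0.3, 600.0, 1.0),
    ("tcp", 0.2, 580.0, 1.0),
]
LABELS = ["worm", "virus", "worm", "virus", "trojan", "trojan"]


def transform(rows):
    """Step 1: encode categorical values as integers, then min-max scale each column."""
    scaled = []
    for col in (list(c) for c in zip(*rows)):
        if isinstance(col[0], str):
            codes = {value: i for i, value in enumerate(sorted(set(col)))}
            col = [float(codes[value]) for value in col]
        lo, hi = min(col), max(col)
        span = hi - lo
        scaled.append([(v - lo) / span if span else 0.0 for v in col])
    return scaled


def reduce_dimensions(columns, names, min_variance=0.01):
    """Step 2 (stand-in): drop near-constant columns that carry no information."""
    kept = [(n, c) for n, c in zip(names, columns) if pvariance(c) >= min_variance]
    return [n for n, _ in kept], [c for _, c in kept]


def select_features(columns, names, labels, target, k=2):
    """Step 3 (stand-in): keep the k features whose mean differs most for `target`."""
    scores = {}
    for name, col in zip(names, columns):
        inside = [v for v, y in zip(col, labels) if y == target]
        outside = [v for v, y in zip(col, labels) if y != target]
        scores[name] = abs(mean(inside) - mean(outside))
    return sorted(scores, key=scores.get, reverse=True)[:k]


columns = transform(SAMPLES)
kept_names, kept_columns = reduce_dimensions(columns, NAMES)

# Main characteristic information per class of malicious code.
selected = {
    target: select_features(kept_columns, kept_names, LABELS, target)
    for target in sorted(set(LABELS))
}

# Common characteristic information: features selected for every class.
common = set.intersection(*(set(features) for features in selected.values()))

print(kept_names)
print(selected)
print(common)
```

On this toy data the constant `flag` column is removed in step 2, and `protocol` is the only feature selected for all three classes. In practice the per-class feature sets would feed a learning algorithm, and the shared set would serve as a starting point for analyzing modified or new samples.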

Acknowledgement

This work was supported by the 2021 Hannam University Research Fund.
