Development of Vaccine with Artificial Intelligence: By Analyzing OP Code Features Based on Text and Image Dataset

Development of an Artificial Intelligence Vaccine through a Study of Text and Image Datasets Based on OP Code Features

  • Received : 2019.07.19
  • Accepted : 2019.09.16
  • Published : 2019.10.31


Because existing methods are limited in their ability to detect newly introduced malware, artificial intelligence vaccines are becoming increasingly important. Existing artificial intelligence vaccines, however, have low detection accuracy because they do not scan all parts of a file. In this paper, we propose an enhanced malware detection method based on the OP Code features that are unique to malware files. Specifically, we tested the method with a Random Forest algorithm trained on text datasets and with the Inception V3 model trained on image datasets. The highest detection accuracy obtained was about 80%.
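To make the two dataset types concrete, the following minimal Python sketch shows one possible way to turn a sequence of OP Code mnemonics into a text-style feature vector and an image-style pixel grid. It is an illustration under stated assumptions, not the encoding used in the paper: the abstract does not specify how OP Codes are extracted or encoded, so the bigram counting, the hash-based pixel mapping, the image width, and all function names here are hypothetical.

```python
from collections import Counter
import hashlib


def opcode_ngrams(opcodes, n=2):
    """Count n-grams over a sequence of opcode mnemonics (text-style feature)."""
    grams = zip(*(opcodes[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)


def to_feature_vector(counts, vocabulary):
    """Project n-gram counts onto a fixed vocabulary as relative frequencies."""
    total = sum(counts.values()) or 1
    return [counts.get(term, 0) / total for term in vocabulary]


def opcode_to_pixel(opcode):
    """Map a mnemonic to a stable grayscale value in 0..255.

    Assumption: a hash byte stands in for whatever encoding the paper uses.
    """
    return hashlib.md5(opcode.encode("utf-8")).digest()[0]


def to_image(opcodes, width=4):
    """Lay pixel values out row by row, zero-padding the last row."""
    pixels = [opcode_to_pixel(op) for op in opcodes]
    pixels += [0] * (-len(pixels) % width)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# Hypothetical opcode sequence, as might come from a disassembler.
sample = ["push", "mov", "call", "mov", "call", "ret"]

counts = opcode_ngrams(sample, n=2)
vocab = sorted(counts)                      # vocabulary built from this sample only
vector = to_feature_vector(counts, vocab)   # input for a tabular classifier
image = to_image(sample, width=4)           # 2D grid for an image classifier
```

In a setup like the one the abstract describes, vectors of this kind could be passed to a Random Forest classifier, and the pixel grids could be resized and fed to an image model such as Inception V3; both steps are outside this sketch.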


AI Vaccine;Intelligent Vaccine;OP Code feature;Text based;Image based


Supported by : National Research Foundation of Korea

