A Study on Variant Malware Detection Techniques Using Static and Dynamic Features

  • Kang, Jinsu (Dept. of Computer Science Engineering, Chungnam National University)
  • Won, Yoojae (Dept. of Computer Science Engineering, Chungnam National University)
  • Received : 2019.11.12
  • Accepted : 2020.04.14
  • Published : 2020.08.31

Abstract

The amount of malware increases exponentially every day, threatening networks and operating systems. Most new malware is a variant of existing malware, and these variants are difficult to handle because they bypass conventional signature-based detection. Research on automated methods of detecting and processing variant malware has therefore been conducted continuously. This paper proposes a method that extracts feature data from files and detects malware using machine learning. Feature data were extracted from 7,000 malware files and 3,000 benign files using static and dynamic malware analysis tools. A malware classification model was built from multiple layers of DNN, XGBoost, and RandomForest, and its performance was analyzed. The proposed method achieved up to 96.3% accuracy.
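The idea of combining static and dynamic features into one input for a classifier can be illustrated with a minimal sketch. The abstract does not list the features actually used, so everything here is an assumption made for illustration: the two static features (file size and byte entropy), the API-call vocabulary `API_VOCAB`, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical vocabulary of API calls logged during dynamic analysis.
# The paper's real feature set is not given in the abstract.
API_VOCAB = ["CreateFileW", "WriteFile", "RegSetValueExW", "connect"]


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def static_features(data: bytes) -> list:
    """Assumed static features: file size and byte entropy."""
    return [float(len(data)), byte_entropy(data)]


def dynamic_features(api_calls: list) -> list:
    """Assumed dynamic features: call count per API in API_VOCAB."""
    counts = Counter(api_calls)
    return [float(counts[name]) for name in API_VOCAB]


def feature_vector(data: bytes, api_calls: list) -> list:
    """Concatenate static and dynamic features into one vector."""
    return static_features(data) + dynamic_features(api_calls)


if __name__ == "__main__":
    sample_bytes = b"\x00\x01\x02\x03"
    sample_calls = ["WriteFile", "connect", "WriteFile"]
    print(feature_vector(sample_bytes, sample_calls))
```

A vector of this kind, computed for each file, is the sort of input that models such as a DNN, XGBoost, or RandomForest could be trained on. Concatenation is only one simple way to fuse the two feature groups.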
