Malware Detection Method Using Opcode and Windows API Calls

  • Received : 2017.09.28
  • Accepted : 2017.12.08
  • Published : 2017.12.31

Abstract

We propose a malware detection method that uses a feature vector composed of opcodes (operation codes) and Windows API calls extracted from executable files. We first build the feature vector from the opcodes and Windows API calls extracted from PE files, and then measure its performance with Bernoulli Naïve Bayes and K-Nearest Neighbor (K-NN) classifiers. In our experiments, the proposed method combined with the K-NN classifier achieved a malware detection accuracy of 95.21%, which is higher than that of existing methods that use only opcodes or only Windows API calls.
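
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It assumes that each file has already been reduced to a set of opcode mnemonics and a set of imported Windows API names, and that the feature vector records only presence or absence of each item. The abstract does not state this encoding, so it is an assumption here. The sample data, the `op:`/`api:` prefixes, and all function names are hypothetical. Both classifiers are written in plain Python for illustration and are not the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical toy samples: (opcodes seen, imported API names, label).
# Labels: 1 = malware, 0 = benign. None of this data comes from the paper.
SAMPLES = [
    ({"push", "mov", "call", "xor"}, {"CreateRemoteThread", "WriteProcessMemory"}, 1),
    ({"mov", "xor", "jmp", "call"}, {"VirtualAllocEx", "WriteProcessMemory"}, 1),
    ({"push", "mov", "add", "ret"}, {"CreateFileW", "ReadFile"}, 0),
    ({"mov", "add", "sub", "ret"}, {"ReadFile", "CloseHandle"}, 0),
]


def build_vocabulary(samples):
    """Collect every opcode and API name, prefixed so the two kinds stay distinct."""
    vocab = set()
    for opcodes, apis, _ in samples:
        vocab.update("op:" + o for o in opcodes)
        vocab.update("api:" + a for a in apis)
    return sorted(vocab)


def to_vector(opcodes, apis, vocab):
    """Binary presence vector over the combined opcode + API vocabulary."""
    present = {"op:" + o for o in opcodes} | {"api:" + a for a in apis}
    return [1 if f in present else 0 for f in vocab]


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training vectors closest in Hamming distance."""
    dists = sorted(
        (sum(a != b for a, b in zip(row, x)), label)
        for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def bernoulli_nb_predict(train_x, train_y, x, alpha=1.0):
    """Bernoulli Naive Bayes with Laplace smoothing, scored in log space."""
    best_label, best_score = None, -math.inf
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        score = math.log(len(rows) / len(train_x))  # class prior
        for j, xj in enumerate(x):
            # Smoothed probability that feature j is present in this class.
            p = (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            score += math.log(p if xj else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


vocab = build_vocabulary(SAMPLES)
train_x = [to_vector(o, a, vocab) for o, a, _ in SAMPLES]
train_y = [label for _, _, label in SAMPLES]

# An unseen (hypothetical) sample that shares features with the malware rows.
query = to_vector({"mov", "xor", "call"}, {"WriteProcessMemory"}, vocab)
print("k-NN:", knn_predict(train_x, train_y, query, k=3))
print("Bernoulli NB:", bernoulli_nb_predict(train_x, train_y, query))
```

Concatenating the opcode and API features into one vector lets a single classifier see both kinds of evidence at once, which is the idea the abstract credits for the improvement over single-feature methods. Extracting opcodes and imports from real PE files would require a PE parser and a disassembler, a step this sketch leaves out.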
