A Chi-Square-Based Decision for Real-Time Malware Detection Using PE-File Features

  • Received : 2015.03.12
  • Accepted : 2015.08.25
  • Published : 2016.12.31

Abstract

Real-time malware detection remains an open problem, because most existing approaches to malware categorization focus on improving accuracy rather than reducing detection time. Finding a proper balance between these two characteristics is therefore very important, especially in security-sensitive systems. In this paper, we present a fast portable executable (PE) malware detection system based on the analysis of the set of Application Programming Interfaces (APIs) called by a program and of some technical PE features (TPFs). We use an efficient feature selection method that first selects the most relevant APIs and TPFs using the chi-square ($\chi^2$) measure and then uses the Phi ($\varphi$) coefficient to group the selected features into subsets according to their relevance. We evaluated our method using different classifiers trained on different combinations of these feature subsets and obtained more than 98% accuracy. The system is suitable for real-time detection, since it categorizes a file as malware or benign in 0.09 seconds.
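The two measures named in the abstract can be illustrated on a 2x2 contingency table that crosses one binary feature (for example, whether a given API is imported) with the class label. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the feature names, the toy data, and the cutoff of 3.841 (the standard $\chi^2$ critical value for one degree of freedom at the 0.05 level) are all hypothetical choices, and the abstract does not say which thresholds the paper uses. For such a table with counts $a, b, c, d$ and $n = a + b + c + d$, $\chi^2 = n(ad - bc)^2 / [(a+b)(c+d)(a+c)(b+d)]$ and $\varphi = (ad - bc) / \sqrt{(a+b)(c+d)(a+c)(b+d)}$, so that $\chi^2 = n\varphi^2$.

```python
from math import sqrt


def contingency(feature, labels):
    """Build the 2x2 table for a binary feature against binary labels.

    Returns (a, b, c, d) where
      a = feature present, malware    b = feature present, benign
      c = feature absent,  malware    d = feature absent,  benign
    """
    a = b = c = d = 0
    for present, is_malware in zip(feature, labels):
        if present and is_malware:
            a += 1
        elif present:
            b += 1
        elif is_malware:
            c += 1
        else:
            d += 1
    return a, b, c, d


def chi_square(a, b, c, d):
    """Pearson chi-square statistic of a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # a row or column is empty: no evidence of association
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def phi(a, b, c, d):
    """Phi coefficient of a 2x2 table; its sign gives the direction."""
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return (a * d - b * c) / sqrt(denom)


def select_and_rank(features, labels, critical=3.841):
    """Keep features whose chi-square reaches `critical`, ranked by |phi|.

    `critical` is an assumed cutoff (df = 1, alpha = 0.05), not a value
    taken from the paper.
    """
    kept = []
    for name, column in features.items():
        table = contingency(column, labels)
        score = chi_square(*table)
        if score >= critical:
            kept.append((name, score, phi(*table)))
    return sorted(kept, key=lambda item: abs(item[2]), reverse=True)


# Hypothetical toy data: 10 malware samples followed by 10 benign samples.
labels = [1] * 10 + [0] * 10
features = {
    # present in 9 of 10 malware samples and 1 of 10 benign samples
    "api_A": [1] * 9 + [0] + [1] + [0] * 9,
    # present in half of each class, so it carries no class information
    "api_B": [1] * 5 + [0] * 5 + [1] * 5 + [0] * 5,
}
ranked = select_and_rank(features, labels)
```

With this toy data only `api_A` passes the cutoff. Ranking by $|\varphi|$ is one simple way to order features by strength of association before splitting them into relevance subsets; the actual subset boundaries would have to come from the paper itself.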
