Detection of Malicious Code using Association Rule Mining and Naive Bayes classification

  • Ju, Yeongji (Dept. of Software Convergence Engineering, Chosun University) ;
  • Kim, Byeongsik (Dept. of Software Convergence Engineering, Chosun University) ;
  • Shin, Juhyun (Dept. of Software Convergence Engineering, Chosun University)
  • Received : 2017.09.05
  • Accepted : 2017.10.18
  • Published : 2017.11.30

Abstract

Although the use of Open APIs has grown with advances in the software industry, so has the variety of malicious code. Many studies have therefore used API data to characterize the behavior of malicious code and to determine whether a given executable file contains it. Existing methods detect malicious code by analyzing signature data; they take a long time to detect mutated malicious code and have a high false detection rate. In this paper, we propose a method that analyzes and detects malicious code using association rule mining and Naive Bayes classification. The proposed method reduces the false detection rate by mining rules over the APIs of malicious and normal code in PE files, grouping the resulting patterns with the DHP (Direct Hashing and Pruning) algorithm, and classifying files as malicious or normal with a Naive Bayes classifier.
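The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature encoding, support threshold, or PE extraction step, so the toy API lists, `min_support=2`, the bucket count, and every function name below are assumptions made for illustration. The first function applies only the hash-bucket pruning idea of DHP to 2-itemsets; the rest is a Bernoulli Naive Bayes with Laplace smoothing over the mined patterns.

```python
import math
import zlib
from collections import Counter
from itertools import combinations


def bucket_of(pair, n_buckets):
    # Deterministic hash of an API pair into one of n_buckets buckets.
    return zlib.crc32("|".join(pair).encode()) % n_buckets


def dhp_frequent_pairs(transactions, min_support, n_buckets=7):
    """Frequent 2-itemsets, using DHP-style hash-bucket pruning (sketch)."""
    item_counts = Counter()
    buckets = [0] * n_buckets
    # Pass 1: count single items and hash every pair into a bucket.
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            buckets[bucket_of(pair, n_buckets)] += 1
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    # Pass 2: count only pairs of frequent items whose bucket is heavy enough.
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t) & frequent_items)
        for pair in combinations(items, 2):
            if buckets[bucket_of(pair, n_buckets)] >= min_support:
                pair_counts[pair] += 1
    return {p for p, c in pair_counts.items() if c >= min_support}


def featurize(apis, patterns):
    # Binary vector: 1 if the file imports every API in the pattern.
    present = set(apis)
    return [int(set(p) <= present) for p in patterns]


def train_nb(X, y):
    # Bernoulli Naive Bayes with Laplace (add-one) smoothing.
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        prior = len(rows) / len(X)
        probs = [(sum(col) + 1) / (len(rows) + 2) for col in zip(*rows)]
        model[label] = (prior, probs)
    return model


def predict_nb(model, x):
    def log_posterior(label):
        prior, probs = model[label]
        return math.log(prior) + sum(
            math.log(p if v else 1 - p) for v, p in zip(x, probs)
        )
    return max(model, key=log_posterior)


# Hypothetical toy data: imported API names per file (not from the paper).
malicious = [
    ["CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"],
    ["CreateRemoteThread", "WriteProcessMemory", "OpenProcess"],
    ["WriteProcessMemory", "VirtualAllocEx", "CreateRemoteThread", "RegSetValueExA"],
]
normal = [
    ["CreateFileW", "ReadFile", "CloseHandle"],
    ["CreateFileW", "ReadFile", "GetMessageW"],
    ["ReadFile", "CloseHandle", "CreateFileW"],
]

# Mine patterns per class, then use their union as the feature set.
patterns = sorted(
    dhp_frequent_pairs(malicious, min_support=2)
    | dhp_frequent_pairs(normal, min_support=2)
)
X = [featurize(f, patterns) for f in malicious + normal]
y = ["malicious"] * len(malicious) + ["normal"] * len(normal)
model = train_nb(X, y)

unseen = ["OpenProcess", "WriteProcessMemory", "CreateRemoteThread"]
print(predict_nb(model, featurize(unseen, patterns)))
```

Bucket pruning cannot discard a truly frequent pair, because a bucket's count is at least the count of any pair hashed into it; it only removes candidates early. The full DHP algorithm also trims transactions between passes and extends to larger itemsets, which this sketch omits.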
