A Security Tendency Analysis Technique Applying Machine Learning Algorithms in Big Data Environments

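  • Do-Hyeon Choi (Department of Computer Science, Soongsil University)
  • Jung-Oh Park (Department of Information and Communications Engineering, Dongyang Mirae University)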
  • Received : 2015.07.20
  • Accepted : 2015.09.20
  • Published : 2015.09.28


Recently, as industries related to big data have grown, global security companies have widened the scope of their analysis from structured to unstructured data for intelligent security threat monitoring and prevention, and they increasingly apply user tendency analysis to security prevention. This is because only limited information can be derived from analyzing existing structured data (existing data that can be quantified). This study applies machine learning algorithms (Naïve Bayes, Decision Tree, K-nearest neighbor, Apriori) in a big data environment to security tendency analysis (classification of items by purpose, positive/negative judgment, and analysis of the relevance of key keywords). A performance analysis confirmed that security items and specific indicators for determining security tendency can be extracted from both structured and unstructured data.
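
As a rough illustration of the positive/negative judgment step, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing. It is not the authors' implementation: the training sentences, the labels, and the names `train_naive_bayes`, `classify`, `TRAINING`, and `MODEL` are all hypothetical and chosen only for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Count documents per label and word occurrences per label.

    samples: list of (text, label) pairs. Tokenization is a plain
    whitespace split, which is an assumption made for this sketch.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        # Log prior: fraction of training documents with this label.
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # Ignore words never seen during training.
            # Add-one smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: short descriptions of user behavior, labeled
# by whether they suggest a positive or negative security tendency.
TRAINING = [
    ("password shared with coworker", "negative"),
    ("clicked unknown email attachment", "negative"),
    ("disabled antivirus to install program", "negative"),
    ("enabled two factor authentication", "positive"),
    ("reported suspicious email to security team", "positive"),
    ("updated password and installed patch", "positive"),
]
MODEL = train_naive_bayes(TRAINING)
```

With this toy data, `classify(MODEL, "clicked unknown attachment")` returns `"negative"` and `classify(MODEL, "reported suspicious attachment to security team")` returns `"positive"`. A real system would need a far larger labeled corpus and proper tokenization of the unstructured text.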

