Classification of Cyber Attack Groups Using Scikit Learn and Cyber Threat Datasets

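  • 김경신 (School of Mobile IT, Chungkang College of Cultural Industries)
  • 이호준 (School of Mobile IT, Chungkang College of Cultural Industries)
  • 김성희 (Digital Twin Co., Ltd.)
  • 김병익 (Security Technology R&D Team, Korea Internet & Security Agency)
  • 나원식 (Department of Computer Software, Namseoul University)
  • 김동욱 (엔코디 Co., Ltd.)
  • 이정환 (AI Co., Ltd.)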
  • Received : 2018.10.12
  • Accepted : 2018.12.20
  • Published : 2018.12.31

Abstract

The most threatening attack in current IT security discussions is the advanced persistent threat (APT) attack. The conclusion so far is that there is no way to respond to APT attacks other than with artificial intelligence techniques. In this paper, we implement a machine learning algorithm for analyzing cyber threat data. Using Scikit Learn, a big data machine learning framework, we apply it to a dataset of collected cyber attack cases. The resulting attack classification accuracy was close to 70%. This result can be developed into an algorithm for security control systems in the future.
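The classification approach summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the incident keyword strings, the labels `group_a` and `group_b`, the helper `train_group_classifier`, and the choice of TF-IDF features with logistic regression are all assumptions made for illustration, since this page does not specify the features or the model used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: each incident is a string of keywords,
# and each label is a made-up attack group name.
incidents = [
    "spearphishing macro dropper c2 beacon",
    "spearphishing macro document c2 dns",
    "macro dropper beacon dns tunnel",
    "watering hole exploit kit ransomware",
    "exploit kit ransomware encrypt payment",
    "watering hole exploit encrypt ransomware",
]
groups = ["group_a", "group_a", "group_a", "group_b", "group_b", "group_b"]


def train_group_classifier(texts, labels):
    """Fit a TF-IDF + logistic regression pipeline (an assumed model choice)."""
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(texts, labels)
    return model


model = train_group_classifier(incidents, groups)

# Classify two unseen keyword strings into one of the known groups.
predicted = model.predict([
    "macro dropper c2 beacon",
    "exploit kit ransomware payment",
])
print(list(predicted))
```

On a real dataset, accuracy would be estimated on held-out data (for example with `sklearn.model_selection.cross_val_score`). Results on this toy data say nothing about the accuracy of close to 70% reported in the abstract.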

Fig. 1. MapReduce Structure

Fig. 2. Hadoop Install – openssh-server

Fig. 3. Spark Install – Variable Edit

Fig. 4. Function Structure

Fig. 5. Create Session

Fig. 6. Loading Datasets

Fig. 7. Result of Sorting

Fig. 8. Result of Clustering

Fig. 9. Algorithm Flow

Fig. 10. Dataset Classification

Fig. 11. Keywords

Fig. 12. Dataset Reallocation

Fig. 13. Result of Experiment
