Detection of Traffic Anomalities using Mining : An Empirical Approach

Anomalous Traffic Detection Using Mining: A Case-Study Approach

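  • 김정현 (Hanyang University, Division of Electronics, Communications and Computer Engineering)
  • 안수한 (University of Seoul, Department of Statistics)
  • 원유집 (Hanyang University, Division of Electronics, Communications and Computer Engineering)
  • 이종문 (National Security Research Institute)
  • 이은영 (National Security Research Institute)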
  • Published : 2006.06.01

Abstract

In this paper, we collect physical traces from high-speed Internet backbone traffic and analyze the characteristics of the underlying packet traces, focusing on anomalous traffic. In our data, the anomalous traffic is caused by UDP session traffic, which we determined to be a form of Denial of Service attack. To classify the network flows, we adopt an unsupervised machine learning approach and train the learner with the k-means clustering algorithm. Via the Cramér-von Mises test, we confirm that the proposed classification method, which can detect anomalous traffic within 1 second, accurately predicts the class of a flow and can be used effectively to identify anomalous flows.
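
As an illustration of the flow-classification step, the following is a minimal sketch of k-means clustering (Lloyd's algorithm) in Python. The feature choice (log-scaled packet and byte counts per flow), the sample values, and k = 2 are assumptions made for illustration; they are not taken from the paper, whose actual features and parameters are not given in this abstract.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Cluster `points` (a list of numeric tuples) into k groups."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the input points
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical per-flow features: (log10 packets, log10 bytes).
flows = [
    (1.0, 3.0), (1.2, 3.1), (0.9, 2.9),  # ordinary-looking flows
    (4.0, 5.5), (4.1, 5.6), (3.9, 5.4),  # unusually heavy flows
]
labels, centroids = kmeans(flows, k=2)
print(labels)
```

The cluster indices themselves are arbitrary; what matters is which flows end up grouped together. Deciding which cluster corresponds to attack traffic still requires inspecting the characteristics of the flows in each cluster.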

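In this paper, we perform a basic statistical analysis of traffic captured over one week from a real Internet backbone and analyze the anomalous traffic that occurred in it. The anomalous traffic appeared in UDP-based traffic flowing into the country from abroad. Exploratory analysis of the traffic data revealed a new type of characteristic in the packets/sec and bytes/sec distributions that appears when anomalous traffic occurs. To classify the flows responsible for this anomalous traffic, we used k-means clustering, an unsupervised learning method, and, based on an analysis of the characteristics of the classified flows, concluded that the anomalous traffic was caused by a type of DoS attack. We also propose a new technique for detecting the points in time at which flows causing anomalous traffic are present. The proposed technique is based on the statistic used in the Cramér-von Mises test, a goodness-of-fit test, and operates at a granularity of 1 second. Applying the technique, we confirmed that the points in time judged to contain anomalous traffic coincide with those of the flows judged to be DoS attacks.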
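
To make the idea of a per-second, goodness-of-fit based detector concrete, here is a minimal Python sketch. It computes the standard one-sample Cramér-von Mises statistic, W² = 1/(12n) + Σ [(2i − 1)/(2n) − F(x₍ᵢ₎)]², for the observations in each 1-second window against a reference distribution F. The abstract does not describe the paper's exact procedure, so everything else here is an assumption: the use of packet sizes as the observed quantity, the use of an empirical CDF from baseline traffic as F, the threshold value, and the helper names `make_ecdf` and `flag_windows`.

```python
from bisect import bisect_right


def make_ecdf(reference):
    """Return the empirical CDF of `reference` as a function of x."""
    ref = sorted(reference)
    n = len(ref)

    def cdf(x):
        # Fraction of reference values less than or equal to x.
        return bisect_right(ref, x) / n

    return cdf


def cvm_statistic(sample, cdf):
    """One-sample Cramér-von Mises statistic of `sample` against `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - cdf(x)) ** 2
        for i, x in enumerate(xs, start=1)
    )


def flag_windows(windows, cdf, threshold):
    """Indices of non-empty 1-second windows whose statistic exceeds `threshold`."""
    return [
        t for t, obs in enumerate(windows)
        if obs and cvm_statistic(obs, cdf) > threshold
    ]


# Hypothetical baseline: packet sizes (bytes) seen during normal operation.
baseline_cdf = make_ecdf(range(100, 1500, 10))

# Hypothetical 1-second windows of observed packet sizes.
windows = [
    [150, 400, 700, 1000, 1300],  # spread similar to the baseline
    [60] * 20,                    # burst of identical small packets
]
print(flag_windows(windows, baseline_cdf, threshold=0.5))
```

The threshold of 0.5 is purely illustrative. In practice it would need to be calibrated on traffic known to be normal, and because the reference here is an empirical rather than a continuous distribution, the classical critical values for the test do not apply directly.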
