Network Classification of P2P Traffic with Various Classification Methods

Han, Seokwan; Hwang, Jinsoo

doi:10.5351/KJAS.2015.28.1.001

The Korean Journal of Applied Statistics (응용통계연구)

Volume 28, Issue 1 / Pages 1-8 / 2015 / 1225-066X (pISSN) / 2383-5818 (eISSN)

The Korean Statistical Society (한국통계학회)



다양한 분류기법을 이용한 네트워크상의 P2P 데이터 분류실험

Han, Seokwan (Department of Statistics, Inha University) ;
Hwang, Jinsoo (Department of Statistics, Inha University)

한석완 (인하대학교 통계학과) ;
황진수 (인하대학교 통계학과)

Received : 2014.08.21
Accepted : 2014.12.24
Published : 2015.02.28

https://doi.org/10.5351/KJAS.2015.28.1.001


Abstract

The rapid growth of internet traffic has made network security a pressing issue, and P2P traffic in particular poses a major challenge for network and server administrators. Preemptive measures, such as inspecting traffic and blocking suspicious flows in advance, are necessary for network quality of service (QoS) and efficient resource management. Deep packet inspection (DPI) is the most accurate way to detect an intrusion, but examining packet contents can raise privacy concerns and takes considerable time and effort, so recent research has largely turned to statistical machine learning methods for finding anomalous traffic. We compared several widely used machine learning methods on how accurately and how quickly they classify network traffic data. The random forest method showed excellent performance in both accuracy and time.
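The kind of comparison the abstract describes, scoring classifiers on both accuracy and elapsed time, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' experiment: the flow features and data are synthetic, the function names (`make_flows`, `fit_stump`, `fit_forest`, `evaluate`) are hypothetical, and the "forest" is a heavily simplified stand-in (bagged one-split trees with random feature subsets) rather than the full random forest algorithm.

```python
import random
import time
from collections import Counter


def make_flows(n, rng):
    """Generate synthetic (features, label) flow records. Purely illustrative."""
    rows = []
    for _ in range(n):
        label = rng.choice(["p2p", "other"])
        shift = 1.5 if label == "p2p" else 0.0
        # Three hypothetical flow statistics; only the first two carry signal.
        x = [rng.gauss(shift, 1.0), rng.gauss(-shift, 1.0), rng.gauss(0.0, 1.0)]
        rows.append((x, label))
    return rows


def majority(labels, default):
    """Most common label, or `default` when the list is empty."""
    counts = Counter(labels)
    return counts.most_common(1)[0][0] if counts else default


def fit_stump(rows, features):
    """Find the single-threshold split with the fewest training errors."""
    overall = majority([y for _, y in rows], "other")
    best = None
    for f in features:
        values = sorted(x[f] for x, _ in rows)
        step = max(1, len(values) // 10)  # try roughly ten candidate thresholds
        for t in values[::step]:
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            l_lab, r_lab = majority(left, overall), majority(right, overall)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, x):
    f, t, l_lab, r_lab = stump
    return l_lab if x[f] <= t else r_lab


def fit_forest(rows, n_trees, rng):
    """Bagged stumps: bootstrap sample plus a random 2-feature subset per tree."""
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]
        feats = rng.sample(range(n_features), 2)
        forest.append(fit_stump(sample, feats))
    return forest


def predict_forest(forest, x):
    """Majority vote over all stumps in the ensemble."""
    return majority([predict_stump(s, x) for s in forest], "other")


def evaluate(name, fit, predict, train, test):
    """Time training plus prediction and compute test accuracy."""
    start = time.perf_counter()
    model = fit(train)
    preds = [predict(model, x) for x, _ in test]
    elapsed = time.perf_counter() - start
    acc = sum(p == y for p, (_, y) in zip(preds, test)) / len(test)
    return {"method": name, "accuracy": acc, "seconds": elapsed}


rng = random.Random(0)
train, test = make_flows(300, rng), make_flows(200, rng)
results = [
    evaluate("single stump", lambda r: fit_stump(r, [0, 1, 2]), predict_stump, train, test),
    evaluate("bagged stumps", lambda r: fit_forest(r, 25, rng), predict_forest, train, test),
]
for r in results:
    print(f"{r['method']}: accuracy={r['accuracy']:.3f}, time={r['seconds']:.4f}s")
```

Reporting accuracy and wall-clock time side by side keeps the trade-off visible: an ensemble may classify more accurately but costs more to train, which matters when traffic has to be classified quickly.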



Cited by

Choosing clusters for two-stage household surveys vol.27, pp.2, 2016, https://doi.org/10.7465/jkdi.2016.27.2.363
