Classification of Traffic Flows into QoS Classes by Unsupervised Learning and KNN Clustering

  • Zeng, Yi (San Diego Supercomputer Center, University of California)
  • Chen, Thomas M. (School of Engineering, Swansea University)
  • Published : 2009.04.25

Abstract

Traffic classification seeks to assign packet flows to an appropriate quality of service (QoS) class based on flow statistics, without the need to examine packet payloads. Classification proceeds in two steps: classification rules are first built by analyzing traffic traces, and the rules are then evaluated using test data. In this paper, we use a self-organizing map and K-means clustering as unsupervised machine learning methods to identify the inherent classes in traffic traces. Three clusters were discovered, corresponding to transactional, bulk data transfer, and interactive applications. The K-nearest neighbor classifier was found to be highly accurate for the traffic data and significantly more accurate than a minimum mean distance classifier.
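
The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two flow features, the synthetic data, the helper names (`make_flows`, `nearest_index`, and so on), the farthest-first initialization, and the choice of k = 5 neighbors are all assumptions made for illustration, and the self-organizing map step is omitted. K-means first discovers the classes in unlabeled flows; a new flow is then assigned to a class either by K-nearest neighbor voting or by the minimum mean distance rule (nearest cluster mean).

```python
import math
import random
from collections import Counter


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def nearest_index(x, centers):
    """Minimum mean distance rule: index of the center closest to x."""
    return min(range(len(centers)), key=lambda c: dist(x, centers[c]))


def kmeans(points, k, max_iters=100):
    """Plain K-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(max_iters):
        labels = [nearest_index(p, centers) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(centroid(members) if members else centers[c])
        if updated == centers:  # assignments are stable, so stop
            break
        centers = updated
    labels = [nearest_index(p, centers) for p in points]
    return labels, centers


def knn_predict(train_points, train_labels, x, k=5):
    """Majority vote among the k training flows closest to x."""
    ranked = sorted(zip(train_points, train_labels), key=lambda t: dist(t[0], x))
    votes = Counter(lab for _, lab in ranked[:k])
    return votes.most_common(1)[0][0]


def make_flows(per_class=30, seed=0):
    """Synthetic flows around three hypothetical class centers.

    The two features are placeholders, not the statistics used in the paper.
    """
    rng = random.Random(seed)
    class_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    return [
        (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
        for cx, cy in class_centers
        for _ in range(per_class)
    ]


flows = make_flows()
labels, centers = kmeans(flows, k=3)   # step 1: discover classes from traces
probe = (9.5, 0.3)                     # step 2: classify a new, unlabeled flow
print("KNN class:", knn_predict(flows, labels, probe))
print("Minimum mean distance class:", nearest_index(probe, centers))
```

On well-separated synthetic clusters like these the two classifiers agree. A gap between them, as the abstract reports for real traffic, would be expected only when clusters overlap or are not compact around their means.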
