Practical evaluation of encrypted traffic classification based on a combined method of entropy estimation and neural networks

  • Zhou, Kun (School of Computer Science and Engineering, University of Electronic Science and Technology of China)
  • Wang, Wenyong (School of Computer Science and Engineering, University of Electronic Science and Technology of China)
  • Wu, Chenhuang (School of Computer Science and Engineering, University of Electronic Science and Technology of China)
  • Hu, Teng (School of Computer Science and Engineering, University of Electronic Science and Technology of China)
  • Received : 2019.04.19
  • Accepted : 2019.10.01
  • Published : 2020.06.08

Abstract

Encrypted traffic classification plays a vital role in cybersecurity as network traffic encryption becomes prevalent. We first briefly introduce three traffic encryption mechanisms: IPsec, SSL/TLS, and SRTP. After evaluating the performance of support vector machine, random forest, naïve Bayes, and logistic regression classifiers for traffic classification, we propose a combined approach of entropy estimation and artificial neural networks. Network traffic is first classified as encrypted or plaintext using entropy estimation; encrypted traffic is then further classified using neural networks. We propose using packet sizes, packet inter-arrival times, and packet direction as the neural network's input. The combined approach was evaluated on a dataset obtained from the Canadian Institute for Cybersecurity. Results show improved precision (by 1 to 7 percentage points), and some application classification metrics improved by nearly 30 percentage points.
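
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the plug-in Shannon entropy estimator, the `ENTROPY_THRESHOLD` value of 7.0 bits per byte, the function names, and the `(timestamp, size, direction)` packet layout are all assumptions made for illustration. The first function flags a payload as likely encrypted when its byte entropy is high; the second builds a fixed-length vector from packet sizes, inter-arrival times, and directions that a neural network could take as input.

```python
import math
import os
from collections import Counter

# Hypothetical cutoff in bits per byte; the abstract does not state a threshold.
ENTROPY_THRESHOLD = 7.0


def byte_entropy(payload: bytes) -> float:
    """Plug-in Shannon entropy estimate in bits per byte (0.0 to 8.0)."""
    if not payload:
        return 0.0
    n = len(payload)
    counts = Counter(payload)  # frequency of each byte value
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_encrypted(payload: bytes, threshold: float = ENTROPY_THRESHOLD) -> bool:
    """Stage 1 (assumed form): high-entropy payloads are treated as encrypted."""
    return byte_entropy(payload) >= threshold


def flow_features(packets, max_packets=8):
    """Stage 2 input (assumed layout): size, inter-arrival time, direction.

    `packets` is a list of (timestamp_seconds, size_bytes, direction) tuples,
    with direction +1 for client-to-server and -1 for server-to-client.
    The result is zero-padded to 3 * max_packets values.
    """
    feats = []
    prev_ts = None
    for ts, size, direction in packets[:max_packets]:
        iat = 0.0 if prev_ts is None else ts - prev_ts
        feats.extend([float(size), iat, float(direction)])
        prev_ts = ts
    feats.extend([0.0] * (3 * max_packets - len(feats)))
    return feats


if __name__ == "__main__":
    plaintext = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 20
    random_like = os.urandom(4096)  # stands in for ciphertext
    print(looks_encrypted(plaintext), looks_encrypted(random_like))
    print(flow_features([(0.0, 100, 1), (0.5, 1500, -1)], max_packets=3))
```

Two caveats apply to this sketch. The plug-in estimate is biased low for short payloads, so a fixed threshold behaves differently on small packets, and compressed plaintext can also show high entropy, so an entropy test alone cannot separate it from ciphertext.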
