UFKLDA: An unsupervised feature extraction algorithm for anomaly detection under cloud environment

  • Wang, GuiPing (College of Information Science and Engineering, Chongqing Jiaotong University)
  • Yang, JianXi (College of Information Science and Engineering, Chongqing Jiaotong University)
  • Li, Ren (College of Information Science and Engineering, Chongqing Jiaotong University)
  • Received : 2018.08.24
  • Accepted : 2019.01.05
  • Published : 2019.10.01


In a cloud environment, performance degradation of virtual machines (VMs), or even downtime, usually develops gradually and is accompanied by anomalous VM states. To characterize the state of a VM more fully, all available performance metrics are collected. For the resulting high-dimensional datasets, this article proposes a feature extraction algorithm based on unsupervised fuzzy linear discriminant analysis with kernel (UFKLDA). By introducing the kernel method, UFKLDA can both handle non-Gaussian datasets effectively and perform nonlinear feature extraction. Two sets of experiments were undertaken. The discriminability experiments introduce quantitative criteria that measure discriminability among all classes of samples; the results show that UFKLDA improves discriminability compared with other popular feature extraction algorithms. The detection accuracy experiments compute accuracy measures for an anomaly detection algorithm (C-SVM) on both the original performance metrics and the extracted features; the results show that anomaly detection using features extracted by UFKLDA improves detection accuracy in terms of sensitivity and specificity.
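
The two accuracy measures named in the abstract, sensitivity and specificity, can be illustrated with a short sketch. It assumes anomalous VM states are the positive class (label 1) and normal states the negative class (label 0). The function name and the sample labels are hypothetical and are not taken from the article.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN): share of true anomalies that were detected.
    Specificity = TN / (TN + FP): share of normal samples correctly passed.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against an empty class to avoid division by zero.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical labels: 1 = anomalous VM state, 0 = normal VM state.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Reporting both measures matters for anomaly detection because anomalous samples are typically rare, so a detector that labels everything as normal can still score a high overall accuracy while having zero sensitivity.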



Supported by : National Natural Science Foundation of China, Chongqing Municipal Education Commission


