A Supervised Feature Selection Method for Malicious Intrusions Detection in IoT Based on Genetic Algorithm

  • Saman Iftikhar (Faculty of Computer Studies, Arab Open University)
  • Daniah Al-Madani (Faculty of Computer Studies, Arab Open University)
  • Saima Abdullah (Department of Computer Science, The Islamia University of Bahawalpur)
  • Ammar Saeed (Department of Computer Science, COMSATS University Islamabad, Wah Campus)
  • Kiran Fatima (TAFE - New South Wales)
  • Received : 2023.03.05
  • Published : 2023.03.30

Abstract

Machine learning methods have been applied successfully across the Internet of Things (IoT) field, aided by gains in computer processing power. They offer an effective way of detecting malicious intrusions in IoT because of their high-level feature extraction capabilities. In this paper, we propose a novel feature selection method for malicious intrusion detection in IoT that combines an evolutionary technique, the Genetic Algorithm (GA), with Machine Learning (ML) algorithms. The proposed model classifies the BoT-IoT dataset, and its quality is evaluated by training and testing classifiers. The data is reduced and several preprocessing steps are applied: removal of unnecessary information, null value checking, label encoding, standard scaling and data balancing. GA is then applied to the preprocessed data to select the most relevant features and keep the model optimized. The features selected by GA are passed to ML classifiers, namely Logistic Regression (LR) and Support Vector Machine (SVM), and the results are evaluated using recall, precision and F1-score. Two sets of experiments are conducted, and it is concluded that hyperparameter tuning has a significant effect on the performance of both ML classifiers. SVM remained the best model in both cases, and overall results improved.
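
The GA-based feature selection described above can be illustrated with a short sketch. This is not the authors' implementation: it assumes a binary chromosome (one bit per feature), truncation selection with one-point crossover and bit-flip mutation, and an F1-based fitness with a small per-feature penalty. To stay dependency-free, a nearest-centroid classifier stands in for LR and SVM, and synthetic, already balanced and unit-scaled data stands in for the preprocessed BoT-IoT dataset. All function names and parameter values are hypothetical.

```python
import random

def f1_score(y_true, y_pred):
    """Binary F1 computed from precision and recall (positive class = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def centroid_predict(X_train, y_train, X_test, mask):
    """Stand-in classifier (assumption): nearest class centroid on selected features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    preds = []
    for x in X_test:
        dist = {lab: sum((x[j] - c) ** 2 for j, c in zip(cols, cen))
                for lab, cen in centroids.items()}
        preds.append(min(dist, key=dist.get))
    return preds

def fitness(mask, data):
    """F1 on the evaluation split minus a small penalty per selected feature."""
    if not any(mask):
        return 0.0
    X_tr, y_tr, X_ev, y_ev = data
    score = f1_score(y_ev, centroid_predict(X_tr, y_tr, X_ev, mask))
    return score - 0.01 * sum(mask)

def ga_select(data, n_features, pop_size=20, generations=15, mut_rate=0.1, seed=0):
    """Evolve binary feature masks and return the fittest one found."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, data), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; the top half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mut_rate else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, data))

def make_data(n, rng, n_features=6):
    """Synthetic balanced data: features 0 and 1 are informative, the rest are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 3 * label
        row[1] -= 3 * label
        X.append(row)
        y.append(label)
    return X, y

data_rng = random.Random(42)
X_tr, y_tr = make_data(200, data_rng)
X_ev, y_ev = make_data(100, data_rng)
data = (X_tr, y_tr, X_ev, y_ev)
best = ga_select(data, n_features=6)
print("selected mask:", best, "fitness:", round(fitness(best, data), 3))
```

In a fuller pipeline, the stand-in classifier would be replaced by LR or SVM, and the fitness would normally be computed on a validation split kept separate from the final test set, so that the reported recall, precision and F1-score are not biased by the selection step.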

Acknowledgement

The authors would like to thank Arab Open University, Saudi Arabia for supporting this study. Dr. Saman Iftikhar is the corresponding author.
