Enhancing Malware Detection with TabNetClassifier: A SMOTE-based Approach

  • Rahimov Faridun (Dept. of Computer Science, Hanyang University)
  • Eul Gyu Im (Dept. of Computer Science, Hanyang University)
  • Published : 2024.05.23

Abstract

Malware detection has become increasingly critical with the proliferation of end devices such as smartphones, Internet of Things (IoT) devices, and personal computers. To improve detection rates and efficiency, research in malware detection has shifted towards machine learning and deep learning approaches, which train models on extensive datasets and evaluate a wide range of features. In this research, we apply TabNet, an architecture designed for deep learning with tabular data, to enhance malware detection. Furthermore, we use the Synthetic Minority Over-Sampling Technique (SMOTE) to counteract the challenges that imbalanced datasets pose for machine learning. SMOTE balances class distributions by generating synthetic minority-class samples, which improves model performance and classification accuracy. Our study demonstrates that SMOTE can effectively neutralize class imbalance bias, resulting in more dependable and precise machine learning models.
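
The oversampling idea behind SMOTE can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names `smote` and `balance`, the toy feature vectors, and the binary labels (0 for benign, 1 for malware) are assumptions made for illustration only. Each synthetic sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length feature vectors (lists of floats).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts.

    Assumes a binary problem: every other label counts as the majority.
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new = smote(minority, max(0, n_majority - len(minority)), k=k, seed=seed)
    return X + new, y + [minority_label] * len(new)


# Toy data (hypothetical): 6 benign rows (label 0), 2 malware rows (label 1).
X = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3], [0.3, 0.3], [0.2, 0.4], [0.1, 0.0],
     [0.9, 0.8], [0.8, 1.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = balance(X, y)
print(y_bal.count(0), y_bal.count(1))  # 6 6
```

In practice, oversampling would normally be applied to the training split only, so that synthetic samples do not leak into evaluation, and a maintained implementation such as the `SMOTE` class in the imbalanced-learn package would likely be preferred over a hand-written one. The balanced training set can then be passed to a tabular classifier such as TabNetClassifier.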

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2022R1A4A1032361).
