An Improved Machine Learning-Based Short Message Service Spam Detection System

  • Odukoya Oluwatoyin (Department of Computer Science and Engineering, Obafemi Awolowo University)
  • Akinyemi Bodunde (Department of Computer Science and Engineering, Obafemi Awolowo University)
  • Gooding Titus (Department of Computer Science and Engineering, Obafemi Awolowo University)
  • Aderounmu Ganiyu (Department of Computer Science and Engineering, Obafemi Awolowo University)
  • Received : 2024.10.05
  • Published : 2024.10.30

Abstract

The use of Short Message Service (SMS) as a means of communication has resulted in the loss of sensitive information such as credit card details, medical information, and bank account details (user names and passwords). Several machine learning-based approaches have been proposed to address this problem, but they still cannot accurately detect modified SMS spam messages. Thus, in this research, a stacked ensemble of four machine learning algorithms, namely Random Forest (RF), Logistic Regression (LR), Multilayer Perceptron (MLP), and Support Vector Machine (SVM), was employed to detect SMS spam more accurately. The simulation was carried out using Python scikit-learn tools. The proposed model was evaluated by benchmarking it against an existing model. The evaluation results showed that the proposed model improved accuracy by 3.03%, recall by 8.94%, and F-measure by 2.17%, while precision decreased by 4.55% relative to the existing model. In conclusion, the ensemble method performed better than any individual algorithm and can be adopted by network service providers for better Quality of Service.
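As a rough illustration of the stacked-ensemble idea described in the abstract, the sketch below combines the four named algorithms with scikit-learn's StackingClassifier. It is not the authors' implementation: the TF-IDF features, the hyperparameters, the choice of Logistic Regression as the meta-learner, and the toy messages are all assumptions made for this example.

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Toy, made-up messages for illustration only (1 = spam, 0 = legitimate).
messages = [
    "WIN a free prize now, reply with your bank details",
    "Urgent: your account is locked, send your password to unlock",
    "Congratulations, you won cash, claim your free prize today",
    "Free credit card offer, reply now with your card number",
    "You have won a lottery, send bank details to claim cash",
    "Claim your free gift now, urgent offer ends today",
    "Are we still meeting for lunch today?",
    "Please call me when you get home",
    "The lecture has been moved to room 12",
    "Can you send me the notes from class?",
    "I will be late for the meeting, sorry",
    "Happy birthday, see you at home tonight",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# The four algorithms named in the abstract, used here as base learners.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)),
    ("svm", SVC(kernel="linear", random_state=0)),
]

# Assumption: a Logistic Regression meta-learner combines the base outputs.
# cv=3 produces the out-of-fold base predictions the meta-learner trains on.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# TF-IDF text features are an assumption; the abstract does not state them.
model = make_pipeline(TfidfVectorizer(), stack)
model.fit(messages, labels)

predictions = model.predict([
    "Free prize! Send your card number to claim",
    "Are we still meeting after class today?",
])
print(predictions)
```

On a real corpus, the fitted model would be scored on a held-out test set with accuracy, precision, recall, and F-measure, the four metrics reported in the abstract.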

Acknowledgement

This research was funded by the TETFund Research Fund and the Africa Centre of Excellence OAK-Park.
