Resume Classification System using Natural Language Processing & Machine Learning Techniques

  • Irfan Ali (Center of Excellence for Robotics, Artificial Intelligence, and Blockchain, Department of Computer Science, Sukkur IBA University) ;
  • Nimra (Center of Excellence for Robotics, Artificial Intelligence, and Blockchain, Department of Computer Science, Sukkur IBA University) ;
  • Ghulam Mujtaba (Center of Excellence for Robotics, Artificial Intelligence, and Blockchain, Department of Computer Science, Sukkur IBA University) ;
  • Zahid Hussain Khand (Center of Excellence for Robotics, Artificial Intelligence, and Blockchain, Department of Computer Science, Sukkur IBA University) ;
  • Zafar Ali (Center of Excellence for Robotics, Artificial Intelligence, and Blockchain, Department of Computer Science, Sukkur IBA University) ;
  • Sajid Khan (Center of Excellence for Robotics, Artificial Intelligence, and Blockchain, Department of Computer Science, Sukkur IBA University)
  • Received : 2024.07.05
  • Published : 2024.07.30

Abstract

Selecting and recommending suitable applicants from a pool of thousands of applications is often a daunting task for an employer, and it significantly increases the workload of the department concerned. A Resume Classification System built on Natural Language Processing (NLP) and Machine Learning (ML) techniques could automate this tedious process and ease the employer's job. Automation can also make applicant selection considerably faster and more transparent, with minimal human involvement. Various ML approaches have already been proposed for developing Resume Classification Systems. This study presents an automated NLP- and ML-based system that classifies resumes according to job categories with performance guarantees. It employs various ML algorithms and NLP techniques to measure the accuracy of Resume Classification Systems and proposes a solution with better accuracy and reliability in different settings. To demonstrate the significance of NLP and ML techniques for processing and classifying resumes, the extracted features were tested on nine machine learning models: Support Vector Machine (SVM) variants (Linear, SGD, SVC, and NuSVC), Naïve Bayes variants (Bernoulli, Multinomial, and Gaussian), K-Nearest Neighbor (KNN), and Logistic Regression (LR). The Term Frequency-Inverse Document Frequency (TF-IDF) feature representation scheme proved suitable for the resume classification task. The developed models were evaluated using F-ScoreM, RecallM, PrecisionM, and overall accuracy. The experimental results indicate that, with the One-vs-Rest classification strategy for this multi-class task, the SVM family of algorithms performed better than the other models on the study dataset, with over 96% overall accuracy. These promising results suggest that the NLP and ML techniques employed in this study could be used for the resume classification task.
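The evaluation measures named in the abstract can be illustrated with a small sketch. It assumes that the "M" suffix in F-ScoreM, RecallM, and PrecisionM denotes macro-averaging, that is, computing each measure per job category and then taking the unweighted mean; the abstract does not spell this out. The function name `macro_metrics` and the category labels are hypothetical and not taken from the study.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return (precision_M, recall_M, f_score_M, accuracy) for multi-class labels.

    Assumption: "M" means macro-averaging (unweighted mean over classes).
    The macro F-score here averages the per-class F1 values; another common
    convention derives it from the macro-averaged precision and recall.
    """
    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # gold instances per class
    pred_count = Counter(y_pred)   # predicted instances per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precisions, recalls, f_scores = [], [], []
    for label in labels:
        prec = correct[label] / pred_count[label] if pred_count[label] else 0.0
        rec = correct[label] / true_count[label] if true_count[label] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f_scores.append(f1)

    n = len(labels)
    accuracy = sum(correct.values()) / len(y_true)
    return sum(precisions) / n, sum(recalls) / n, sum(f_scores) / n, accuracy


# Hypothetical job categories, for illustration only.
y_true = ["HR", "HR", "IT", "IT", "Sales", "Sales"]
y_pred = ["HR", "IT", "IT", "IT", "Sales", "HR"]
precision_m, recall_m, f_score_m, accuracy = macro_metrics(y_true, y_pred)
print(f"PrecisionM={precision_m:.3f} RecallM={recall_m:.3f} "
      f"F-ScoreM={f_score_m:.3f} Accuracy={accuracy:.3f}")
```

Macro-averaging gives every category equal weight regardless of how many resumes it contains, so a model that does well only on the largest categories scores lower on these measures than on overall accuracy.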
