Severity-based Fault Prediction using Unsupervised Learning

  • Hong, Euyseok (Dept. of Information Systems Engineering, Sungshin Women's University)
  • Received : 2018.04.09
  • Accepted : 2018.06.08
  • Published : 2018.06.30

Abstract

Most previous studies of software fault prediction have focused on supervised learning models for binary classification, which determine whether an input module has faults or not. However, a binary classification model determines only the presence or absence of faults in a module without considering the complex characteristics of those faults, and a supervised model requires a training data set that most development groups do not have. To address these two problems, this paper proposes a severity-based ternary classification model that uses unsupervised learning algorithms. Experimental results show that the proposed model performs comparably to supervised models.
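To make the idea of an unsupervised, severity-based ternary classifier concrete, the following is a minimal sketch and not the method described in the paper. It assumes plain k-means with three clusters, made-up module metrics (lines of code, cyclomatic complexity), and a hypothetical labeling rule that ranks clusters by the size of their centroid, on the assumption that larger metric values indicate more severe fault-proneness. The names `kmeans`, `classify_modules`, and `LABELS` are illustrative only.

```python
from statistics import mean

# Hypothetical class names for the three severity levels (an assumption).
LABELS = ("no-fault", "low-severity", "high-severity")


def squared_distance(a, b):
    """Squared Euclidean distance between two metric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k=3, iterations=50):
    """Plain k-means with a deterministic initialisation (requires k >= 2)."""
    # Seed centroids with evenly spaced points after sorting by total metric value.
    ordered = sorted(points, key=sum)
    step = (len(ordered) - 1) / (k - 1)
    centroids = [ordered[round(i * step)] for i in range(k)]
    assignment = []
    for _ in range(iterations):
        # Assign every module to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no module changed cluster
        assignment = new_assignment
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return centroids, assignment


def classify_modules(points):
    """Cluster modules into three groups and label them by centroid size.

    Assumption: clusters with larger metric values are more fault-prone.
    No training labels are used at any point.
    """
    centroids, assignment = kmeans(points, k=3)
    ranked = sorted(range(3), key=lambda c: sum(centroids[c]))
    label_of = {cluster: LABELS[rank] for rank, cluster in enumerate(ranked)}
    return [label_of[a] for a in assignment]


# Made-up (lines of code, cyclomatic complexity) pairs for nine modules.
modules = [
    (10, 1), (12, 2), (11, 1),
    (50, 8), (55, 9), (52, 7),
    (200, 30), (210, 35), (190, 28),
]
for module, label in zip(modules, classify_modules(modules)):
    print(module, label)
```

In practice the metrics would normally be normalised before clustering so that one large-valued metric does not dominate the distance, and the cluster-to-label mapping would need to be validated against known fault data.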
