Proceedings of the Korea Information Processing Society Conference (한국정보처리학회:학술대회논문집)
- 2017.04a
- Pages 787-790
- 2017
- 2005-0011 (pISSN)
- 2671-7298 (eISSN)
Classification Accuracy Improvement for Decision Tree
- Rezene, Mehari Marta (Dept. of Computer Science, Yonsei University) ;
- Park, Sanghyun (Dept. of Computer Science, Yonsei University)
- Published : 2017.04.27
Abstract
Data quality is a central issue in classification problems: noisy instances in the training dataset generally prevent robust classification performance. Such instances may cause the generated decision tree to over-fit, which can lower its accuracy. Decision trees are useful, efficient, and commonly used for solving real-world classification problems in data mining. In this paper, we introduce a preprocessing technique that improves the classification accuracy of the C4.5 decision tree algorithm. The proposed method applies the naive Bayes classifier to remove noisy instances from the training dataset. We applied the method to a real e-commerce sales dataset and compared its performance with that of the existing C4.5 decision tree classifier. In the experiments, the proposed method improved classification accuracy by 8.5% when evaluated on the training dataset and by 14.32% under 10-fold cross-validation.
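The preprocessing step can be illustrated with a minimal sketch. The abstract does not state how an instance is judged noisy, so this example assumes that an instance is noisy when a naive Bayes model trained on the full training set predicts a label different from the recorded one. The function names, the Laplace smoothing, and the toy records are hypothetical and are not taken from the paper; the filtered data would then be passed to a C4.5 learner, which is not shown here.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Fit a categorical naive Bayes model (counts only)."""
    class_counts = Counter(labels)
    # value_counts[(feature_index, label)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    # Distinct values seen per feature, needed for Laplace smoothing.
    feature_values = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict(model, row):
    """Return the class with the highest log posterior for one row."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior
        for i, value in enumerate(row):
            # Laplace-smoothed conditional probability (an assumption).
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def remove_noisy_instances(rows, labels):
    """Keep instances whose label agrees with the naive Bayes prediction.

    Assumption: disagreement with the prediction marks an instance as noisy.
    """
    model = train_naive_bayes(rows, labels)
    kept = [(r, y) for r, y in zip(rows, labels) if predict(model, r) == y]
    return [r for r, _ in kept], [y for _, y in kept]


# Hypothetical toy records: (weather, discount) -> purchase decision.
rows = [("sunny", "yes")] * 4 + [("rainy", "no")] * 4 + [("sunny", "yes")]
labels = ["buy"] * 4 + ["skip"] * 4 + ["skip"]  # the last label is the noisy one

clean_rows, clean_labels = remove_noisy_instances(rows, labels)
print(len(rows), "instances before filtering,", len(clean_rows), "after")
```

In this toy data the final record contradicts four otherwise identical records, so the filter drops it and keeps the remaining eight. Filtering with a model trained on the same data it filters is a design choice of this sketch; a cross-validated filter would be a reasonable alternative.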