• Title, Summary, Keyword: Decision Tree


A Development of Suicidal Ideation Prediction Model and Decision Rules for the Elderly: Decision Tree Approach (의사결정나무 기법을 이용한 노인들의 자살생각 예측모형 및 의사결정 규칙 개발)

  • Kim, Deok Hyun;Yoo, Dong Hee;Jeong, Dae Yul
    • The Journal of Information Systems / v.28 no.3 / pp.249-276 / 2019
  • Purpose: The purpose of this study is to develop a prediction model and decision rules for suicidal ideation among the elderly, based on Korean Welfare Panel survey data. Using these data, we obtained many decision rules for predicting the elderly's suicidal ideation. Design/methodology/approach: This study used classification analysis based on the decision tree technique to derive decision rules for prediction. Weka 3.8 was used as the data mining tool. The decision tree algorithm was J48, also known as C4.5. The data were divided into learning data and verification data using a 66.6% split. We considered all possible variables identified in previous studies on predicting suicidal ideation of the elderly. In the end, 99 variables, including the target variable, were used. Classification analysis was performed with backward elimination and with a sampling technique for data balancing. Findings: There were significant differences between the data sets. The selected data sets yielded different decision trees and rules. Based on the decision tree method, we derived rules for suicide prevention. The decision tree derives rules for the suicidal ideation not only of the depressed group but also of the non-depressed group. In addition, in developing the predictive model, the problem of over-fitting caused by data imbalance was directly identified through the application of data balancing. We conclude that the data must be balanced on the target variable in order to perform a correct classification analysis without over-fitting. Moreover, even with data balancing applied, the prediction rate was not inferior to that of the biased prediction model.
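
The data balancing and percentage split described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' Weka procedure: it assumes random undersampling of the majority class (the abstract does not name the sampling technique), and the toy rows, the `ideation` field, and the function names are hypothetical.

```python
import random

def undersample(rows, label_key, seed=0):
    """Randomly drop rows until every class has as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    rng.shuffle(balanced)
    return balanced

def percentage_split(rows, train_fraction=0.666):
    """Use the first train_fraction of the rows for learning, the rest for verification."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Hypothetical toy data: 10 rows with the target present, 90 without.
rows = [{"id": i, "ideation": i < 10} for i in range(100)]
balanced = undersample(rows, "ideation")
train, test = percentage_split(balanced)
print(len(balanced), len(train), len(test))
```

Undersampling discards majority-class rows; oversampling the minority class is an alternative that keeps all rows at the cost of duplicates.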

Fuzzy Decision Tree Induction to Obliquely Partitioning a Feature Space (특징공간을 사선 분할하는 퍼지 결정트리 유도)

  • Lee, Woo-Hang;Lee, Keon-Myung
    • Journal of KIISE:Software and Applications / v.29 no.3 / pp.156-166 / 2002
  • Decision tree induction is a useful machine learning approach for extracting classification rules from a set of feature-based examples. According to how they partition the feature space, decision trees are categorized into univariate decision trees and multivariate decision trees. Because of observation error, uncertainty, subjective judgment, and so on, real-world data are prone to contain errors in their feature values. To make decision trees robust against such errors, there have been various attempts to incorporate fuzzy techniques into decision tree construction. Several studies have incorporated fuzzy techniques into univariate decision trees. For multivariate decision trees, however, little research has been done along these lines. This paper proposes a fuzzy decision tree induction method that builds fuzzy multivariate decision trees, named fuzzy oblique decision trees. To show the effectiveness of the proposed method, it also presents some experimental results.
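
The difference between a univariate (axis-parallel) test, a multivariate (oblique) test, and a fuzzy version of the latter can be sketched as below. The logistic membership function and the `softness` parameter are illustrative assumptions; the abstract does not specify how the paper defines fuzzy membership.

```python
import math

def univariate_split(x, feature, threshold):
    """Axis-parallel test: compares one feature with a threshold."""
    return x[feature] <= threshold

def oblique_split(x, weights, bias):
    """Oblique test: checks which side of the hyperplane w.x + b = 0 the point is on."""
    return sum(w * v for w, v in zip(weights, x)) + bias <= 0

def fuzzy_oblique_membership(x, weights, bias, softness=1.0):
    """Degree in [0, 1] to which x belongs to the 'left' branch.

    Assumption for illustration: a logistic function of the signed
    distance from x to the hyperplane.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    distance = (sum(w * v for w, v in zip(weights, x)) + bias) / norm
    return 1.0 / (1.0 + math.exp(distance / softness))

# A point on the hyperplane x1 - x2 = 0 belongs to both branches equally.
on_boundary = fuzzy_oblique_membership((1.0, 1.0), weights=(1.0, -1.0), bias=0.0)
# A point far on the negative side belongs almost fully to the left branch.
far_left = fuzzy_oblique_membership((0.0, 5.0), weights=(1.0, -1.0), bias=0.0)
print(on_boundary, far_left)
```

A crisp split sends a noisy point near the boundary entirely down one branch; the fuzzy membership instead lets it contribute partially to both, which is the robustness argument the abstract makes.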

Research on improving correctness of cardiac disorder data classifier by applying Best-First decision tree method (Best-First decision tree 기법을 적용한 심전도 데이터 분류기의 정확도 향상에 관한 연구)

  • Lee, Hyun-Ju;Shin, Dong-Kyoo;Park, Hee-Won;Kim, Soo-Han;Shin, Dong-Il
    • Journal of Internet Computing and Services / v.12 no.6 / pp.63-71 / 2011
  • Cardiac disorder data are generally tested with a classifier, and the QRS complex and R-R interval used in this experiment are often extracted from ECG (electrocardiogram) signals. Experiments on ECG data are generally performed with SVM (Support Vector Machine) and MLP (Multilayer Perceptron) classifiers, but to improve accuracy this study used the Best-First Decision Tree (B-F Tree), derived from the Decision Tree among Random Forest classifier algorithms. To compare and analyze accuracy, experiments with SVM, MLP, RBF (Radial Basis Function) Network, and Decision Tree classifiers were performed, and the results were also compared with those of published papers that used the same interval and data. Compared with the above four classifiers, the Random Forest classifier was the most accurate. Although the R-R interval was extracted with a band-pass filter during pre-processing in this experiment, further study of filters is needed to extract the interval accurately.

Development of Decision Tree Program based on Web for Analyzing Clinical Information of Sasang Constitutional Medicine (사상체질 임상정보 분석을 위한 웹 기반의 의사결정 나무 프로그램 개발)

  • Jin, Hee-Jeong;Kim, Myoung-Geun;Kim, Jong-Yeol
    • Korean Journal of Oriental Medicine / v.14 no.3 / pp.81-87 / 2008
  • Sasang Constitutional Medicine (SCM) is a traditional medical theory based on constitutional medicine in Korea. It is most important that a person's SCM type be determined accurately before any Sasang treatment is applied. To this end, many studies have sought to diagnose the SCM type using constitutional clinical data. The decision tree is a tree-structured data-mining methodology. Recently, in the Korean traditional medicine community, there have been several efforts to find diagnostic tools using the decision tree method. We therefore developed a web-based decision tree program for analyzing constitutional clinical information. It accepts various clinical data as input and offers a filtering function for selecting the clinical data to be used. Using this program, we can find factors that influence SCM types.


Improved Decision Tree Algorithms by Considering Variables Interaction (교호효과를 고려한 향상된 의사결정나무 알고리듬에 관한 연구)

  • Kwon, Keunseob;Choi, Gyunghyun
    • Journal of Korean Institute of Industrial Engineers / v.30 no.4 / pp.267-276 / 2004
  • Much previous research on decision trees has focused on the splitting criteria and the optimization of tree size. Nowadays the quantity of data is increasing and the relations among variables are becoming very complex. As a result, trees come to have a large number of unnecessary nodes and leaves, and the reliability of the decision tree's explanations and forecasts declines. In this research report, we propose decision tree algorithms that consider the interaction of predictor variables. A generic algorithm, the k-1 Algorithm, which deals with the interaction using a combination of all predictor variables, is presented, followed by an extended version, the k-k Algorithm, which considers the interaction at every k-depth using a combination of some predictor variables. We also present an improved algorithm that introduces a control parameter into these algorithms. The algorithms are tested on real-world credit card data, census data, bank data, etc.
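
Why variable interaction matters for splitting can be seen in a small sketch. It does not reproduce the k-1 or k-k algorithms, whose details are not given in this abstract; it only shows, on a hypothetical XOR-style data set and with Gini impurity as an assumed splitting criterion, that a test on a combination of two variables can be informative when neither variable is informative alone.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def gini_gain(rows, labels, test):
    """Reduction in Gini impurity obtained by splitting on a boolean test."""
    left = [y for x, y in zip(rows, labels) if test(x)]
    right = [y for x, y in zip(rows, labels) if not test(x)]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Hypothetical data: the label is 1 exactly when the two variables differ.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

gain_x1 = gini_gain(rows, labels, lambda x: x[0] == 0)
gain_x2 = gini_gain(rows, labels, lambda x: x[1] == 0)
gain_interaction = gini_gain(rows, labels, lambda x: x[0] == x[1])
print(gain_x1, gain_x2, gain_interaction)
```

A tree restricted to single-variable tests needs two levels to separate these classes, while one interaction test does it in a single node, which is the kind of size reduction the abstract is after.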

Classification Accuracy Improvement for Decision Tree (의사결정트리의 분류 정확도 향상)

  • Rezene, Mehari Marta;Park, Sanghyun
    • Proceedings of the Korea Information Processing Society Conference / pp.787-790 / 2017
  • Data quality is the main issue in classification problems; in general, noisy instances in the training dataset prevent robust classification performance. Such instances may cause the generated decision tree to over-fit, and its accuracy may decrease. Decision trees are useful, efficient, and commonly used for solving various real-world classification problems in data mining. In this paper, we introduce a preprocessing technique to improve the classification accuracy of the C4.5 decision tree algorithm. In the proposed preprocessing method, we apply the naive Bayes classifier to remove noisy instances from the training dataset. We applied the proposed method to a real e-commerce sales dataset to test its performance against the existing C4.5 decision tree classifier. In the experiments, the proposed method improved classification accuracy by 8.5% and 14.32% using the training dataset and 10-fold cross-validation, respectively.
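
The preprocessing idea, using naive Bayes to drop suspicious training instances before building the tree, can be sketched as follows. This is a minimal illustration under stated assumptions: categorical features, Laplace smoothing, and the rule "remove every instance that naive Bayes misclassifies" are choices made here, since the abstract does not give the exact filtering criterion. The toy data and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, class) -> value counts
    values_seen = defaultdict(set)        # feature index -> distinct values
    for x, y in zip(rows, labels):
        for j, v in enumerate(x):
            value_counts[(j, y)][v] += 1
            values_seen[j].add(v)
    return class_counts, value_counts, values_seen

def predict(model, x):
    """Return the class with the highest log-posterior (Laplace-smoothed)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for y, n_y in class_counts.items():
        score = math.log(n_y / total)
        for j, v in enumerate(x):
            count = value_counts[(j, y)][v]
            score += math.log((count + 1) / (n_y + len(values_seen[j])))
        if score > best_score:
            best_label, best_score = y, score
    return best_label

def filter_noisy(rows, labels):
    """Keep only the instances whose label agrees with the naive Bayes prediction."""
    model = train_naive_bayes(rows, labels)
    return [(x, y) for x, y in zip(rows, labels) if predict(model, x) == y]

# Hypothetical toy data: value "a" goes with class 1, "b" with class 0,
# plus one mislabeled ("a", 0) instance.
rows = [("a",)] * 5 + [("b",)] * 5 + [("a",)]
labels = [1] * 5 + [0] * 5 + [0]
kept = filter_noisy(rows, labels)
print(len(kept))
```

The filtered list `kept` would then be passed to the decision tree learner in place of the original training set.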

A Decision Tree-based Analysis for Paralysis Disease Data

  • Shin, Yangkyu
    • Communications for Statistical Applications and Methods / v.8 no.3 / pp.823-829 / 2001
  • Despite the rapid development of modern medical science, paralysis disease remains a highly dangerous and deadly disease. Shin et al. (1978) constructed a diagnosis expert system that identifies the type of paralysis disease from a patient's symptoms by using canonical discriminant analysis. Decision tree-based analysis, however, has advantages over the method used in Shin et al. (1998): it does not need assumptions of linearity and normality, and it suggests an appropriate diagnosis procedure that is easily explained. In this paper, we applied the decision tree to construct a model that identifies the type of paralysis disease.


Diagnostic Classification Scheme in Iranian Breast Cancer Patients using a Decision Tree

  • Malehi, Amal Saki
    • Asian Pacific Journal of Cancer Prevention / v.15 no.14 / pp.5593-5596 / 2014
  • Background: The objective of this study was to determine a diagnostic classification scheme using a decision tree based model. Materials and Methods: The study was conducted as a retrospective case-control study at Imam Khomeini hospital in Tehran from 2001 to 2009. Data, including demographic and clinical-pathological characteristics, were uniformly collected from 624 women: 312 referred with a positive diagnosis of breast cancer (cases) and 312 healthy women (controls). The decision tree was implemented with CART 6.0 software to develop a diagnostic classification scheme. The AUC (area under the curve) was measured as the overall diagnostic classification performance of the decision tree. Results: Five variables were identified as main risk factors of breast cancer and six subgroups as high risk. The results indicated that increasing age, low age at menarche, single and divorced status, irregular menarche pattern, and family history of breast cancer are the important diagnostic factors in Iranian breast cancer patients. The sensitivity and specificity of the analysis were 66% and 86.9%, respectively. The high AUC (0.82) also showed excellent classification and diagnostic performance of the model. Conclusions: A decision tree based model appears to be suitable for identifying risk factors and high or low risk subgroups. It can also assist clinicians in making decisions, since it can identify underlying prognostic relationships and the model is explicit and easy to understand.
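
The three performance measures reported in this abstract (sensitivity, specificity, AUC) can be computed as in the sketch below. The labels and scores are hypothetical toy values, not the study's data, and the 0.5 decision threshold is an assumption for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    """Probability that a random case scores above a random control (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy example: 1 = case, 0 = control; scores are predicted risks.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
print(sens, spec, area)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality over all thresholds, which is why the abstract reports both.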

Length-of-Stay Prediction Model of Appendicitis using Artificial Neural Networks and Decision Tree (신경망과 의사결정 나무를 이용한 충수돌기염 환자의 재원일수 예측모형 개발)

  • Chung, Suk-Hoon;Han, Woo-Sok;Suh, Yong-Moo;Rhee, Hyun-SiIl
    • Journal of the Korea Academia-Industrial cooperation Society / v.10 no.6 / pp.1424-1432 / 2009
  • For the efficient management of hospital beds, it is important to predict the length of stay (LoS) of appendicitis patients. This study analyzed patient data to find factors that show a high positive correlation with LoS, built LoS prediction models using neural network and decision tree models, and compared their performance. To increase prediction accuracy, we applied ensemble techniques such as bagging and boosting. Experimental results show that the decision tree model, which was built with fewer variables, achieves prediction accuracy almost equal to that of the neural network model, and that bagging is better than boosting. In conclusion, since the decision tree model, which is easier to explain than the neural network model, predicts the LoS of appendicitis patients well and can also be used to select input variables, it is recommended that hospitals make more active use of decision tree techniques.
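
Bagging, one of the ensemble techniques mentioned in this abstract, can be sketched as below. This is an illustration only: it uses one-level trees (decision stumps) on a single numeric feature with majority voting, whereas the study's models, variables, and settings are not given here. The ages, labels, and function names are hypothetical.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a one-level tree: choose the threshold and leaf labels with the fewest training errors."""
    best = None
    for threshold in sorted(set(xs)):
        for left_label, right_label in ((0, 1), (1, 0)):
            errors = sum(
                (left_label if x <= threshold else right_label) != y
                for x, y in zip(xs, ys)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label

def bagging(xs, ys, n_models=25, seed=0):
    """Train each stump on a bootstrap sample and combine them by majority vote."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict

# Hypothetical toy data: patient age and whether the stay was long (1) or short (0).
ages = [20, 25, 30, 35, 40, 60, 65, 70, 75, 80]
long_stay = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
predict_los = bagging(ages, long_stay)
print(predict_los(22), predict_los(78))
```

Boosting differs in that models are trained sequentially, each one reweighting the examples the previous ones got wrong, rather than independently on bootstrap samples.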

A study of constitution diagnosis using decision tree method (의사결정나무법을 이용한 체질진단에 관한 연구)

  • Lee, Yong-Seop;Park, Seong-Sik;Park, Eun-Kyung
    • Journal of Sasang Constitutional Medicine / v.13 no.2 / pp.144-155 / 2001
  • With growing interest in Sasang Constitution Medicine, its practical use is considered very important in disease prevention and medical treatment. However, the method of constitution classification depends on the doctor's clinical trials because objective test criteria are lacking. This study tries to improve the objectivity of diagnosis using a new statistical method, the decision tree. The decision tree method, a classification technique in statistical analysis, was used instead of discriminant analysis to analyze the results of QSCCII. As a result, 16 of the 121 QSCCII questions were selected as important questions, and 21 terminal nodes were built to classify the constitution. Using only the 16 questions identified by the decision tree, we can diagnose and interpret the constitution easily and effectively.
