• Title/Summary/Keyword: Classification tree


A Study of Pathogenesis Classification using Decision Tree Method (의사결정나무법을 이용한 병인(病因)분류에 관한 연구)

  • Lee, Hyuk-Jae;Kim, Min-Yong;Oh, Hwan-Sup;Park, Young-Bae
    • The Journal of the Society of Korean Medicine Diagnostics / v.12 no.2 / pp.27-40 / 2008
  • Background: Despite the predominance of the theory of pathogenesis, pathogenesis classification depends on the doctor's clinical experience because objective test criteria are lacking. Methods and Results: This study attempts to improve the objectivity of classification using a new statistical method, the decision tree. The decision tree method, a classification technique in statistical analysis, was used instead of discriminant analysis to analyze the results of a pathogenesis questionnaire. As a result, 10 of the 38 pathogenesis questionnaire items were selected as important questions, and 12 terminal nodes were built to classify the pathogenesis. Conclusions: Using only the 10 questions identified by the decision tree, we can classify and interpret the pathogenesis easily and effectively.


Tree-structured Classification based on Variable Splitting

  • Ahn, Sung-Jin
    • Communications for Statistical Applications and Methods / v.2 no.1 / pp.74-88 / 1995
  • This article introduces a unified method of choosing the most explanatory and significant multiway partitions for classification tree design and analysis. The method is derived from the impurity reduction (IR) measure of divergence, which is proposed as an extension of the proportional-reduction-in-error (PRE) measure in the decision-theory context. To derive the method, the IR measure is analyzed to characterize its statistical properties, which are then used to handle consistently the feature formation, feature selection, and feature deletion required in classification tree construction. A numerical example illustrates the proposed approach.

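As a rough illustration of what an impurity reduction score for a multiway partition can look like, the sketch below uses Gini impurity as the node impurity. This is an assumption made for the example: the abstract does not say which impurity function the IR measure uses, and the function names and toy data here are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_reduction(labels, groups):
    """Parent impurity minus the size-weighted impurity of the child nodes.

    `groups` assigns each observation to one cell of a multiway partition.
    """
    n = len(labels)
    children = {}
    for label, group in zip(labels, groups):
        children.setdefault(group, []).append(label)
    weighted = sum(len(child) / n * gini(child) for child in children.values())
    return gini(labels) - weighted


# Hypothetical toy data: one three-way split that separates the classes
# perfectly, and one two-way split that carries no class information.
labels = ["a", "a", "b", "b", "c", "c"]
perfect_split = [0, 0, 1, 1, 2, 2]
useless_split = [0, 1, 0, 1, 0, 1]

print(impurity_reduction(labels, perfect_split))
print(impurity_reduction(labels, useless_split))
```

A partition that leaves every child node pure recovers the full parent impurity, while a partition unrelated to the class gives a reduction of about zero.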

A study on data mining techniques for soil classification methods using cone penetration test results

  • Junghee Park;So-Hyun Cho;Jong-Sub Lee;Hyun-Ki Kim
    • Geomechanics and Engineering / v.35 no.1 / pp.67-80 / 2023
  • Due to the nature of the conjunctive Cone Penetration Test (CPT), which does not verify the actual sample directly, geotechnical engineers commonly classify underground geomaterials from CPT results using classification diagrams proposed by various researchers. However, such classification diagrams may fail to reflect local geotechnical characteristics, potentially resulting in misclassification that does not align with the actual stratification in regions with strong local features. To address this, this paper presents an objective method for deriving more accurate local CPT soil classification criteria, which uses C4.5 decision tree models trained with CPT results from the clay-dominant southern coast of Korea and the sand-dominant region in South Carolina, USA. The results and analyses demonstrate that the C4.5 algorithm, in conjunction with oversampling, outlier removal, and pruning methods, can enhance and optimize the decision tree-based CPT soil classification model.
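C4.5 selects splits by gain ratio, that is, information gain divided by the split information of the candidate attribute. The sketch below computes that criterion for a categorical attribute only; it is not the paper's model, and the soil labels and binned feature values are hypothetical toy data.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by split information."""
    n = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the partition sizes themselves
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: soil class and two binned features per sounding.
soil = ["clay", "clay", "sand", "sand"]
informative_bin = ["low", "low", "high", "high"]
uninformative_bin = ["x", "y", "x", "y"]

print(gain_ratio(informative_bin, soil))
print(gain_ratio(uninformative_bin, soil))
```

Dividing by the split information penalizes attributes that fragment the data into many small cells, which plain information gain tends to favor.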

Rule Selection Method in Decision Tree Models (의사결정나무 모델에서의 중요 룰 선택기법)

  • Son, Jieun;Kim, Seoung Bum
    • Journal of Korean Institute of Industrial Engineers / v.40 no.4 / pp.375-381 / 2014
  • Data mining is the process of discovering useful patterns or information in large amounts of data. The decision tree is a data mining algorithm that can be used for both classification and prediction, and it has been widely applied because of its flexibility and interpretability. Decision trees for classification generally generate a number of rules, each belonging to one of the predefined categories, and several rules may belong to the same category. In this case, it is necessary to determine the significance of each rule so that rule priorities can be presented to users. The purpose of this paper is to propose a rule selection method for classification tree models that accounts for the number of observations, accuracy, and effectiveness of each rule. Our experiments demonstrate that the proposed method performs better than existing rule selection methods.
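The idea of prioritizing rules can be sketched by measuring, for each rule, how many observations it covers and how often its predicted class is correct. The abstract does not give the paper's scoring formula, so the ordering used below (accuracy first, coverage as tie-breaker) is a placeholder assumption, and the records, rule names, and `evaluate_rule` helper are all hypothetical.

```python
def evaluate_rule(rule, predicted_class, records):
    """Return (coverage, accuracy) of one rule over labeled records."""
    covered = [r for r in records if rule(r)]
    if not covered:
        return 0, 0.0
    correct = sum(1 for r in covered if r["label"] == predicted_class)
    return len(covered), correct / len(covered)


# Hypothetical labeled observations.
records = [
    {"age": 25, "income": 30, "label": "no"},
    {"age": 45, "income": 80, "label": "yes"},
    {"age": 50, "income": 90, "label": "yes"},
    {"age": 35, "income": 85, "label": "no"},
]

# Hypothetical rules, as they might be read off two root-to-leaf paths.
rules = {
    "income > 60": (lambda r: r["income"] > 60, "yes"),
    "age > 40 and income > 60": (
        lambda r: r["age"] > 40 and r["income"] > 60,
        "yes",
    ),
}

# Each entry is (name, coverage, accuracy); sort by accuracy, then coverage.
ranked = sorted(
    ((name, *evaluate_rule(rule, cls, records)) for name, (rule, cls) in rules.items()),
    key=lambda item: (item[2], item[1]),
    reverse=True,
)

for name, coverage, accuracy in ranked:
    print(f"{name}: coverage={coverage}, accuracy={accuracy:.2f}")
```

Any scoring that trades coverage against accuracy could replace the sort key; the point is only that both quantities are computed per rule before ranking.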

A review of tree-based Bayesian methods

  • Linero, Antonio R.
    • Communications for Statistical Applications and Methods / v.24 no.6 / pp.543-559 / 2017
  • Tree-based regression and classification ensembles form a standard part of the data-science toolkit. Many commonly used methods take an algorithmic view, proposing greedy methods for constructing decision trees; examples include the classification and regression trees algorithm, boosted decision trees, and random forests. Recent history has seen a surge of interest in Bayesian techniques for constructing decision tree ensembles, with these methods frequently outperforming their algorithmic counterparts. The goal of this article is to survey the landscape surrounding Bayesian decision tree methods, and to discuss recent modeling and computational developments. We provide connections between Bayesian tree-based methods and existing machine learning techniques, and outline several recent theoretical developments establishing frequentist consistency and rates of convergence for the posterior distribution. The methodology we present is applicable for a wide variety of statistical tasks including regression, classification, modeling of count data, and many others. We illustrate the methodology on both simulated and real datasets.

A study of constitution diagnosis using decision tree method (의사결정나무법을 이용한 체질진단에 관한 연구)

  • Lee, Yong-Seop;Park, Seong-Sik;Park, Eun-Kyung
    • Journal of Sasang Constitutional Medicine / v.13 no.2 / pp.144-155 / 2001
  • With growing interest in Sasang Constitutional Medicine, its practical use is considered very important in disease prevention and medical treatment. However, constitution classification depends on the doctor's clinical experience because objective test criteria are lacking. This study attempts to improve the objectivity of diagnosis using a new statistical method, the decision tree. The decision tree method, a classification technique in statistical analysis, was used instead of discriminant analysis to analyze the results of the QSCCII. As a result, 16 of the 121 QSCCII questions were selected as important questions, and 21 terminal nodes were built to classify the constitution. Using only the 16 questions identified by the decision tree, we can diagnose and interpret the constitution easily and effectively.


Analysis of PD Distribution Characteristics and Comparison of Classification Methods according to Electrical Tree Source in Power Cable (전력용 케이블 시편에서 전기트리 발생원에 따른 부분방전 분포 특성 및 발생원 분류기법 비교)

  • Park, Seong-Hee;Jeong, Hae-Eun;Lim, Kee-Joe;Kang, Seong-Hwa
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.20 no.1 / pp.57-64 / 2007
  • Electrical treeing discharge is a well-known cause of insulation failure in power cables. It arises from continuous stress imposed on the cable and bears on safety, reliability, and maintenance. This paper analyzes the partial discharge (PD) distribution during electrical treeing in order to characterize treeing discharge according to the type of defect. Because a tree develops differently for each defect, identifying these differences is the first purpose of this paper. To acquire PD data, three defective tree models were made, and their data are presented using the phase-resolved partial discharge (PRPD) method. The PRPD results show that each tree discharge source has its own characteristics, and that these characteristics differ markedly when other defects (void, metal particle) exist inside the power cable. This result is related to the time to breakdown, which makes it important for cable diagnosis. The paper also studies methods for classifying PD sources. To select the most useful method for PD data classification, methods of different types were applied: a neural network (NN-BP), an adaptive neuro-fuzzy inference system (ANFIS), and PCA-LDA. ANFIS showed the highest classification rate, 98 %. In general, PCA-LDA and ANFIS performed better than BP. Finally, classification of tree progress using ANFIS yielded a rate of 92 %.

Game Traffic Classification Using Statistical Characteristics at the Transport Layer

  • Han, Young-Tae;Park, Hong-Shik
    • ETRI Journal / v.32 no.1 / pp.22-32 / 2010
  • Pervasive game environments have driven explosive growth of the Internet over recent decades. Thus, understanding Internet traffic characteristics and classifying traffic precisely have become important issues in network management, resource provisioning, and game application development. Naturally, much attention has been given to analyzing and modeling game traffic. Little research, however, has been undertaken on the classification of game traffic. In this paper, we perform an interpretive traffic analysis of popular game applications at the transport layer and propose a new classification method based on a simple decision tree, called an alternative decision tree (ADT), which utilizes the statistical traffic characteristics of game applications. Experimental results show that ADT precisely distinguishes game traffic from other application traffic types using limited traffic features and a small number of packets, while maintaining low complexity by utilizing a simple decision tree.

Development of the forest type classification technique for the mixed forest with coniferous and broad-leaved species using the high resolution satellite data

  • Sasakawa, Hiroshi;Tsuyuki, Satoshi
    • Proceedings of the KSRS Conference / 2003.11a / pp.467-469 / 2003
  • This research aimed to develop a forest type classification technique for mixed forest with coniferous and broad-leaved species using high-resolution satellite data. QuickBird data were used as the satellite data. The method was to extract satellite data for every single tree crown using an image segmentation technique, and then to evaluate classification accuracy while changing the grouping criteria: tree species, families, coniferous or broad-leaved species, and timber prices. As a result, classification at the tree species and family levels was inaccurate, whereas classification at the coniferous or broad-leaved and timber price levels was highly accurate.


Decision Tree Classifier for Multiple Abstraction Levels of Data (다중 추상화 수준의 데이터를 위한 결정 트리 분류기)

  • Jeong, Min-A;Lee, Do-Heon
    • The KIPS Transactions:PartD / v.10D no.1 / pp.23-32 / 2003
  • Because data are collected from disparate sources in many real data mining environments, it is common to have data values at different abstraction levels. This paper shows that such multiple abstraction levels can cause undesirable effects in decision tree classification. After explaining that forcibly equalizing abstraction levels cannot provide a satisfactory solution to this problem, it presents a method that uses the data as they are. The proposed method accommodates the generalization/specialization relationship between data values in both the construction and class assignment phases of decision tree classification. The experimental results show that the proposed method reduces classification error rates significantly when multiple abstraction levels of data are involved.