• Title, Summary, Keyword: Decision Tree

A study on decision tree creation using marginally conditional variables (주변조건부 변수를 이용한 의사결정나무모형 생성에 관한 연구)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.299-307 / 2012
  • Data mining is a method of searching for interesting relationships among items in a given database, and the decision tree is one of its typical algorithms. A decision tree classifies or predicts by dividing a group into subgroups. In general, the generated model can become complicated depending on the model creation criteria and the number of input variables. In particular, a decision tree with a large number of input variables can be complex and difficult to analyze. Marginally conditional variables (intervening variables, external variables) among the input variables are not directly relevant. In this study, we suggest a method of creating a decision tree using marginally conditional variables and apply it to actual data to examine its efficiency.

A Study of Improving on Test Costs in Decision Trees (Decision Tree의 Test Cost 개선에 관한 연구)

  • 석현태
    • Proceedings of the Korean Information Science Society Conference / pp.223-225 / 2002
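  • A decision tree helps a great deal in understanding data because it presents a hierarchical view of the target data, but the limitations of tree construction by a greedy algorithm mean that it cannot be called an optimal predictor. To compensate for this weakness, we propose a method that applies a multidimensional association rule algorithm to a decision tree generated in the usual way in order to obtain an optimal partial rule set of short length, and we confirm this through experiments.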

Decision Tree Techniques with Feature Reduction for Network Anomaly Detection (네트워크 비정상 탐지를 위한 속성 축소를 반영한 의사결정나무 기술)

  • Kang, Koohong
    • Journal of the Korea Institute of Information Security & Cryptology / v.29 no.4 / pp.795-805 / 2019
  • Recently, there has been growing interest in network anomaly detection technology to tackle unknown attacks, and diverse studies have applied data mining, machine learning, and deep learning to detect network anomalies. In this paper, we evaluate the feasibility of the decision tree, one of the most popular data mining techniques for classification, for network anomaly detection on the NSL-KDD data set. To handle the over-fitting problem of the decision tree, we select 13 of the original 41 features of the data set using the chi-square test, and then model the decision tree using TensorFlow and scikit-learn, yielding binary classification accuracies of 84% and 70% on the KDDTest+ and KDDTest-21 sets of the NSL-KDD test data, respectively. These results are improvements of 3 and 6 percentage points over the previous decision tree accuracies of 81% and 64%, respectively.
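
The chi-square feature selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes categorical features and uses a small made-up data set, not NSL-KDD; the names `chi_square_score`, `select_top_k`, and the toy columns are hypothetical, and the paper's actual TensorFlow and scikit-learn pipeline is not reproduced here.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic of one categorical feature against the class labels."""
    n = len(labels)
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    observed = Counter(zip(feature, labels))
    score = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            # Expected cell count if feature and label were independent.
            expected = value_count * label_count / n
            score += (observed[(value, label)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-square score."""
    ranked = sorted(
        columns,
        key=lambda name: chi_square_score(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy records: 4 attacks followed by 4 normal connections.
labels = ["attack"] * 4 + ["normal"] * 4
columns = {
    "flag": ["S0"] * 4 + ["SF"] * 4,  # perfectly associated with the label
    "protocol": ["tcp", "tcp", "tcp", "udp", "tcp", "udp", "udp", "udp"],
    "service": ["http", "http", "dns", "dns", "http", "http", "dns", "dns"],
}

selected = select_top_k(columns, labels, k=2)
print(selected)
```

Features with a low score (here `service`, which is independent of the label) are dropped before the tree is trained, which is how reducing the feature set is meant to limit over-fitting.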

NPC Control Model for Defense in Soccer Game Applying the Decision Tree Learning Algorithm (결정트리 학습 알고리즘을 활용한 축구 게임 수비 NPC 제어 방법)

  • Cho, Dal-Ho;Lee, Yong-Ho;Kim, Jin-Hyung;Park, So-Young;Rhee, Dae-Woong
    • Journal of Korea Game Society / v.11 no.6 / pp.61-70 / 2011
  • In this paper, we propose a defense NPC control model for soccer games that applies the Decision Tree learning algorithm. The proposed model extracts the direction patterns and action patterns generated by many soccer game users and applies them to the Decision Tree learning algorithm. It then decides the direction and the action according to the learned Decision Tree. Experimental results show that, although learning the Decision Tree takes some time, the proposed model takes only 0.001-0.003 milliseconds to decide the direction and the action based on the learned Decision Tree. Therefore, the proposed model can control NPCs in a soccer game system in real time. The proposed model also achieves higher accuracy than a previous model (Letia98) because it can utilize current state information, its analyzed information, and previous state information.

A study on decision tree creation using intervening variable (매개 변수를 이용한 의사결정나무 생성에 관한 연구)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.671-678 / 2011
  • Data mining searches for interesting relationships among items in a given database. Its methods include decision trees, association rules, clustering, neural networks, and so on. The decision tree approach is most useful in classification problems and divides the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing and customer classification. When a decision tree model is created, the model creation criteria and the number of input variables can produce a complicated model. In particular, model creation and analysis are difficult when there are many input variables. In this study, we examine decision tree creation using intervening variables. We apply the approach to actual data to suggest a method that removes unnecessary input variables from the created model and to examine its efficiency.

Decision Tree based Scheduling for Static and Dynamic Flexible Job Shops with Multiple Process Plans (다중 공정계획을 가지는 정적/동적 유연 개별공정에 대한 의사결정 나무 기반 스케줄링)

  • Yu, Jae-Min;Doh, Hyoung-Ho;Kwon, Yong-Ju;Shin, Jeong-Hoon;Kim, Hyung-Won;Nam, Sung-Ho;Lee, Dong-Ho
    • Journal of the Korean Society for Precision Engineering / v.32 no.1 / pp.25-37 / 2015
  • This paper suggests a decision tree based approach for flexible job shop scheduling with multiple process plans. The problem is to determine the operation/machine pairs and the sequence of the jobs assigned to each machine. Two decision tree based scheduling mechanisms are developed, one for static and one for dynamic flexible job shops. In the static case, all jobs are given in advance and the decision tree is used to select a priority dispatching rule to process all the jobs. In the dynamic case, jobs arrive over time and the decision tree, updated regularly, is used to select a priority rule in real time according to a rescheduling strategy. The two decision tree based mechanisms were applied to a flexible job shop case with reconfigurable manufacturing cells and to a conventional job shop, and the results are reported for various system performance measures.

A review of tree-based Bayesian methods

  • Linero, Antonio R.
    • Communications for Statistical Applications and Methods / v.24 no.6 / pp.543-559 / 2017
  • Tree-based regression and classification ensembles form a standard part of the data-science toolkit. Many commonly used methods take an algorithmic view, proposing greedy methods for constructing decision trees; examples include the classification and regression trees algorithm, boosted decision trees, and random forests. Recent history has seen a surge of interest in Bayesian techniques for constructing decision tree ensembles, with these methods frequently outperforming their algorithmic counterparts. The goal of this article is to survey the landscape surrounding Bayesian decision tree methods, and to discuss recent modeling and computational developments. We provide connections between Bayesian tree-based methods and existing machine learning techniques, and outline several recent theoretical developments establishing frequentist consistency and rates of convergence for the posterior distribution. The methodology we present is applicable for a wide variety of statistical tasks including regression, classification, modeling of count data, and many others. We illustrate the methodology on both simulated and real datasets.

A methodology for Internet Customer segmentation using Decision Trees

  • Cho, Y.B.;Kim, S.H.
    • Proceedings of the Korea Intelligent Information System Society Conference / pp.206-213 / 2003
  • Applying existing decision tree algorithms to Internet retail customer classification tends to produce a bushy tree because the source data are imprecise. Even an exhaustive analysis may not guarantee business effectiveness, although its results are derived from fully detailed segments. Thus, it is necessary to determine an appropriate number of segments at a certain level of abstraction. In this study, we developed a stopping rule that considers the total amount of information gained while generating a rule tree. In addition to proceeding forward from the root to intermediate nodes at a certain level of abstraction, the decision tree is examined by a backtracking pruning method that uses misclassification loss information.
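
The abstract does not spell out its stopping rule, so the following is only a simplified stand-in: an ID3-style tree on categorical features that stops splitting a node when the best available information gain falls below a threshold. The function names, the `min_gain` parameter, and the toy customer records are all assumptions made for illustration, and the backtracking pruning step is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, features, min_gain):
    """Return a leaf label, or a (feature, {value: subtree}) tuple."""
    majority = Counter(labels).most_common(1)[0][0]
    if not features or len(set(labels)) == 1:
        return majority
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    if information_gain(rows, labels, best) < min_gain:
        return majority  # assumed stopping rule: too little information gained
    remaining = [f for f in features if f != best]
    children = {}
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        children[value] = build_tree(
            [r for r, _ in subset], [l for _, l in subset], remaining, min_gain
        )
    return (best, children)


# Hypothetical customer records: 6 frequent visitors, 4 infrequent ones.
rows = (
    [{"visits": "high", "browser": "a"}] * 3
    + [{"visits": "high", "browser": "b"}] * 3
    + [{"visits": "low", "browser": b} for b in "abab"]
)
labels = ["buyer"] * 5 + ["non-buyer"] * 5

shallow = build_tree(rows, labels, ["visits", "browser"], min_gain=0.3)
deep = build_tree(rows, labels, ["visits", "browser"], min_gain=0.1)
print(shallow)
print(deep)
```

With the lower threshold the tree adds a `browser` split whose two leaves predict the same class, which is the kind of bushy but uninformative segmentation the stopping rule is meant to avoid.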

Analysis of the Characteristics of the Older Adults with Depression Using Data Mining Decision Tree Analysis (의사결정나무 분석법을 활용한 우울 노인의 특성 분석)

  • Park, Myonghwa;Choi, Sora;Shin, A Mi;Koo, Chul Hoi
    • Journal of Korean Academy of Nursing / v.43 no.1 / pp.1-10 / 2013
  • Purpose: The purpose of this study was to develop a prediction model for the characteristics of older adults with depression using the decision tree method. Methods: A large dataset from the 2008 Korean Elderly Survey was used, and data from 14,970 elderly people were analyzed. The target variable was depression, and the 53 input variables covered general characteristics, family & social relationships, economic status, health status, health behavior, functional status, leisure & social activity, quality of life, and living environment. Data were analyzed by decision tree analysis, a data mining technique, using the SPSS Window 19.0 and Clementine 12.0 programs. Results: The decision trees yielded five different rules that define the characteristics of older adults with depression. Among the data mining models, Classification & Regression Tree (C&RT) showed the best prediction, with an accuracy of 80.81%. Factors in the rules were life satisfaction, nutritional status, daily activity difficulty due to pain, functional limitation for basic or instrumental daily activities, number of chronic diseases, and daily activity difficulty due to disease. Conclusion: The different rules classified by the decision tree model in this study should serve as baseline data for discovering informative knowledge and developing interventions tailored to these individual characteristics.