• Title/Summary/Keyword: Decision Tree Algorithm

A Study of Improving on Test Costs in Decision Trees (Decision Tree의 Test Cost 개선에 관한 연구)

  • 석현태
    • Proceedings of the Korean Information Science Society Conference / 2002.10c / pp.223-225 / 2002
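  • Decision trees help in understanding data because they present a hierarchical view of the target data, but the limits of greedy tree construction mean they cannot be called optimal predictors. To compensate for this weakness, the paper proposes applying a multidimensional association rule algorithm to a decision tree generated in the usual way in order to obtain an optimal partial rule set of short length, and confirms this experimentally.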

A Decision Tree Algorithm using Genetic Programming

  • Park, Chongsun;Ko, Young Kyong
    • Communications for Statistical Applications and Methods / v.10 no.3 / pp.845-857 / 2003
  • We explore the use of genetic programming to evolve decision trees directly for classification problems with both discrete and continuous predictors. We demonstrate that the hypotheses derived by standard algorithms can deviate substantially from the optimum. This deviation is partly due to their top-down procedures. The performance of the system is measured on a set of real and simulated data sets and compared with that of well-known algorithms such as CHAID, CART, C5.0, and QUEST. The proposed algorithm appears to be effective in handling problems caused by the top-down procedures of existing algorithms.

Interpretability Comparison of Popular Decision Tree Algorithms (대표적인 의사결정나무 알고리즘의 해석력 비교)

  • Hong, Jung-Sik;Hwang, Geun-Seong
    • Journal of Korean Society of Industrial and Systems Engineering / v.44 no.2 / pp.15-23 / 2021
  • Most open-source decision tree algorithms are based on one of three splitting criteria (Entropy, Gini Index, and Gain Ratio). The advantages and disadvantages of these three popular criteria therefore need to be studied more thoroughly. Comparisons of the three have so far been made mainly with respect to predictive performance. In this work, we conducted a comparative experiment on the three splitting criteria, focusing on interpretability. Depth, homogeneity, coverage, lift, and stability were used as indicators of interpretability. To measure the stability of decision trees, we present a measure of the stability of the root node and of the stability of the dominating rules, both based on a measure of tree similarity. Using 10 data sets collected from UCI and Kaggle, we compare the interpretability of DT (Decision Tree) algorithms based on the three splitting criteria. The results show that the GR (Gain Ratio) based DT algorithm performs well in terms of lift and homogeneity, while the GINI (Gini Index) and ENT (Entropy) based DT algorithms perform well in terms of coverage. With respect to stability, whether measured by the similarity of the dominating rules or by the similarity of the root node, the DT algorithm using the ENT splitting criterion shows the best results.
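
For readers unfamiliar with the three splitting criteria named in this abstract, the following is a minimal Python sketch of how each one scores a candidate split. It uses the standard textbook definitions and is not the authors' experimental code; the function names and the toy labels are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_scores(parent, children):
    """Score a split of `parent` into the label lists in `children`."""
    n = len(parent)
    weights = [len(ch) / n for ch in children]
    # Information gain: drop in entropy after the split.
    info_gain = entropy(parent) - sum(w * entropy(ch) for w, ch in zip(weights, children))
    # Gini gain: drop in Gini impurity after the split.
    gini_gain = gini(parent) - sum(w * gini(ch) for w, ch in zip(weights, children))
    # Gain ratio: information gain normalised by the entropy of the branch sizes.
    split_info = -sum(w * math.log2(w) for w in weights if w > 0)
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    return {"info_gain": info_gain, "gini_gain": gini_gain, "gain_ratio": gain_ratio}


# Toy data (hypothetical): 10 samples split into two branches of 5.
parent = ["yes"] * 5 + ["no"] * 5
children = [["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4]
scores = split_scores(parent, children)
print(scores)
```

With two equal-sized branches the split information is exactly 1 bit, so the gain ratio equals the information gain here; the two differ when a split produces many small or unbalanced branches.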

A Development for Short-term Stock Forecasting on Learning Agent System using Decision Tree Algorithm (의사결정 트리를 이용한 학습 에이전트 단기주가예측 시스템 개발)

  • 서장훈;장현수
    • Journal of the Korea Safety Management & Science / v.6 no.2 / pp.211-229 / 2004
  • The foundation for cyber trading has matured with the rapid advance of Internet technology, and stock market investment has shifted from long-term investment, which estimates the value of enterprises, to short-term investment, which focuses on short-term trading margins. This research therefore presents a Short-term Stock Price Forecasting System built on a learning agent system using a DTA (Decision Tree Algorithm); it collects real-time information on issues of interest and favorite issues over the Internet using agent technology, forms a decision tree, and creates a rule-base database. Through this procedure the system provides customers with predictions of near-future price fluctuations for each issue and with buy and sell points. Because human analytic ability is limited, the agent observes and analyzes stock price fluctuations so that users can trace the external factors behind stock market fluctuations in real time. Users can therefore monitor the ups and downs of several issues at the same time and work out the relationships among many issues using the agent. The SPFA (Stock Price Forecasting System) has four basic phases: Data Collection, Data Processing, Learning, and Forecasting and Feedback.

Efficient Fuzzy Rule Generation Using Fuzzy Decision Tree (퍼지 결정 트리를 이용한 효율적인 퍼지 규칙 생성)

  • 민창우;김명원;김수광
    • Journal of the Korean Institute of Telematics and Electronics C / v.35C no.10 / pp.59-68 / 1998
  • The goal of data mining is to develop automatic and intelligent tools and technologies that can find useful knowledge in databases. To meet this goal, we propose an efficient data mining algorithm based on the fuzzy decision tree. The proposed method combines the comprehensibility of decision trees such as ID3 and C4.5 with the representational power of fuzzy set theory, so it can generate simple and comprehensible rules describing data. The proposed algorithm consists of two stages: the first stage generates the fuzzy membership functions using histogram analysis, and the second stage constructs a fuzzy decision tree using those membership functions. Testing the proposed algorithm on the IRIS data and the Wisconsin Breast Cancer data, we found that the proposed method can generate a set of fuzzy rules from data efficiently.

A customer credit Prediction Researched to Improve Credit Stability based on Artificial Intelligence

  • MUN, Ji-Hui;JUNG, Sang Woo
    • Korean Journal of Artificial Intelligence / v.9 no.1 / pp.21-27 / 2021
  • Since the 1990s, Korea's credit card industry has developed steadily. As a result, various problems have arisen, such as careless customer information management and loans to low-credit customers. These, in turn, led to a high delinquency rate across the card industry and a negative impact on the economy. Therefore, in this paper, using Azure, we analyze and predict the delinquency and delinquency periods of credit loans according to gender, car ownership, property, number of children, education level, marital status, and employment status through linear regression analysis and a boosted decision tree algorithm. These predictions can reduce the likelihood of reckless credit lending and credit card issuance, reducing the number of bad creditors and the risk borne by banks. In addition, after segmenting the customer base according to the predicted results, the predictions can serve as a basis for reducing the risk of credit loans by developing a credit product suited to each customer. The Azure results showed that, of Linear Regression and the Boosted Decision Tree algorithm, the Boosted Decision Tree algorithm made the more accurate predictions. In future work, we intend to increase the accuracy of the analysis by assigning a number to each data item and repeating the prediction.

Adaptive Decision Algorithm for an Improvement of RFID Anti-Collision (RFID의 효율적인 태그인식을 위한 Adaptive Decision 알고리즘)

  • Ko, Young-Eun;Oh, Kyoung-Wook;Bang, Sung-Il
    • Journal of the Institute of Electronics Engineers of Korea TC / v.44 no.4 / pp.1-9 / 2007
  • In this paper, we propose the Adaptive Decision algorithm for RFID tag anti-collision. We study the ALOHA anti-collision technique and the binary search anti-collision algorithm for RFID tags. The existing techniques have several problems, including the transmitted data rate, the recognition time, and energy efficiency. To distinguish all tags, the Adaptive Decision algorithm sums the '1' bits at each Tag_ID bit position and identifies the smaller side. In other words, because the Adaptive Decision algorithm actively adapts its selection criterion, it can reduce the number of unnecessary searches compared with the existing algorithms. The algorithm was evaluated using the reader's number of repetitions and the number of bits transmitted to identify the tags as criteria. The results show that the Adaptive Decision algorithm performs better than the existing algorithms.
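
As background for the tree-based baselines this abstract compares against, here is a minimal Python sketch of a generic query-tree style prefix search that counts reader queries. It is an illustrative baseline only: it is neither the paper's Adaptive Decision algorithm nor the exact binary search variant the authors studied, and the function name and tag IDs are hypothetical. Tag IDs are assumed to be unique bit strings.

```python
def query_tree_identify(tag_ids):
    """Identify all tags by prefix queries; return (identified, query_count)."""
    identified = []
    queries = 0
    stack = [""]  # prefixes the reader still has to query
    while stack:
        prefix = stack.pop()
        queries += 1
        # Every tag whose ID starts with the queried prefix replies.
        responders = [t for t in tag_ids if t.startswith(prefix)]
        if len(responders) == 1:
            # Exactly one reply: the tag is read without collision.
            identified.append(responders[0])
        elif len(responders) > 1:
            # Collision: extend the prefix by one bit and query both halves.
            stack.append(prefix + "1")
            stack.append(prefix + "0")
        # No reply: idle query, nothing further to do for this prefix.
    return identified, queries


# Hypothetical 4-bit tag IDs.
tags = ["0010", "0111", "1000", "1011"]
found, n_queries = query_tree_identify(tags)
print(found, n_queries)
```

The query count is the kind of "number of repetitions" metric the abstract mentions; a scheme that chooses which branch to explore more cleverly aims to avoid the collided and idle queries that this plain traversal incurs.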

Industrial Waste Database Analysis Using Data Mining Techniques

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.455-465 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules and relationships in massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques. We use the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These outputs can be used for environmental preservation and environmental improvement.

Industrial Waste Database Analysis Using Data Mining

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Proceedings of the Korean Data and Information Science Society Conference / 2006.04a / pp.241-251 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to discover hidden knowledge, unexpected patterns, and new rules and relationships in massive data. Data mining methods include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques. We use the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These analysis outputs can be used for environmental preservation and environmental improvement.

Typical Classification of Rural Area Considering Settlement Environment by Decision Tree Method (정주여건을 고려한 의사결정나무기법 활용 농촌지역 유형화)

  • Bae, Seung-Jong;Kim, Dae-Sik;Eun, Sang-Kyu
    • Journal of The Korean Society of Agricultural Engineers / v.58 no.6 / pp.79-92 / 2016
  • The objective of this study is to classify the types of rural areas (138 si·gun) considering the settlement environment using a decision tree method (CHAID). The CHAID method was used as the decision tree algorithm, with seven dependent variables and five explanatory variables selected. Using the decision tree method, rural areas were finally classified into six groups through three separate processes. Urban areas, areas with lower aging rates, and areas with higher farmland area ratios were found to be relatively well off in terms of the settlement environment index compared with other areas. In the future, this study can serve as a reference for planning rural development projects.
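
CHAID (Chi-squared Automatic Interaction Detection) chooses splits with chi-square tests of independence between a predictor and the target. The Python sketch below shows only that core step, the Pearson chi-square statistic for a contingency table, under stated assumptions: the variable names and counts are hypothetical and are not taken from the study, and a full CHAID implementation also merges categories and compares adjusted p-values rather than raw statistics.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows = predictor categories, columns = area types.
candidates = {
    "aging_rate_band": [[30, 10], [10, 30]],
    "farmland_ratio_band": [[22, 18], [18, 22]],
}
stats = {name: chi_square(table) for name, table in candidates.items()}
# Both tables have the same degrees of freedom, so the larger statistic
# corresponds to the smaller p-value and would be preferred for the split.
best = max(stats, key=stats.get)
print(stats, best)
```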