• Title/Summary/Keyword: Decision Tree Regression

Analysis of Cultural Property Information Using Statistical Classification Methods (통계적 분류방법을 이용한 문화재 정보 분석)

  • Kang, Min-Gu;Sung, Su-Jin;Lee, Jin-Young;Na, Jong-Hwa
    • Proceedings of the Korea Society for Industrial Systems Conference / 2009.05a / pp.120-125 / 2009
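  • In this paper, we analyze cultural property data using statistical classification methods. The classification methods used are linear discriminant analysis, logistic regression, decision tree analysis, neural network analysis, and SVM analysis. We briefly introduce the concepts and theory behind each classification method. In the real-data analysis, we build a fitted model for buried cultural properties with each classification method, based on the Iksan City data from the data used in "Statistical Analysis and Model Development Study of Cultural Properties by Region, Phase 1 (2008)" (지역별 문화재 통계분석 및 모형개발 연구 1차(2008)). We then compare the performance of the fitted models using the constructed models and the results of simulation experiments. The analysis tool is R-project, which has recently attracted the most interest.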

A GA-based Binary Classification Method for Bankruptcy Prediction (도산예측을 위한 유전 알고리듬 기반 이진분류기법의 개발)

  • Min, Jae-H.;Jeong, Chul-Woo
    • Journal of the Korean Operations Research and Management Science Society / v.33 no.2 / pp.1-16 / 2008
  • The purpose of this paper is to propose a new binary classification method for predicting corporate failure based on a genetic algorithm, and to validate its prediction power through empirical analysis. The proposed method establishes virtual companies representing bankrupt and non-bankrupt companies respectively, measures the similarity between these virtual companies and the subject of prediction, and classifies the subject as either bankrupt or non-bankrupt. The values of the classification variables of the virtual companies and the weights of the variables are determined, using a genetic algorithm, so that the model maximizes the hit ratio on the training data set. To test the validity of the proposed method, we compare its prediction accuracy with that of existing methods such as multi-discriminant analysis, logistic regression, decision tree, and artificial neural network, and show that the proposed binary classification method can serve as a promising alternative to the existing methods for bankruptcy prediction.
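
The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the chromosome layout (bankrupt prototype, non-bankrupt prototype, variable weights), the weighted squared distance used as the similarity measure, the genetic operators, and all function names are assumptions made for illustration only.

```python
import random


def weighted_distance(x, prototype, weights):
    """Weighted squared distance; a smaller value means more similar (assumed measure)."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, prototype))


def classify(x, chrom, n_features):
    """Return 1 (bankrupt) if x is closer to the bankrupt virtual company, else 0."""
    bankrupt = chrom[:n_features]
    healthy = chrom[n_features:2 * n_features]
    weights = chrom[2 * n_features:]
    closer_to_bankrupt = (weighted_distance(x, bankrupt, weights)
                          < weighted_distance(x, healthy, weights))
    return 1 if closer_to_bankrupt else 0


def hit_ratio(chrom, X, y):
    """Fraction of training samples classified correctly."""
    n = len(X[0])
    return sum(classify(x, chrom, n) == t for x, t in zip(X, y)) / len(X)


def evolve(X, y, pop_size=30, generations=40, seed=0):
    """Tiny genetic algorithm: keep the better half, refill by crossover plus mutation."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.random() for _ in range(3 * n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: hit_ratio(c, X, y), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, 3 * n)          # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(3 * n)               # mutate one gene, clipped to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda c: hit_ratio(c, X, y))
```

Here features are assumed to be scaled to [0, 1], and `evolve(X, y)` returns the chromosome with the highest training hit ratio found. A real study would also evaluate the result on held-out data.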

A GA-based Classification Model for Predicting Consumer Choice (유전 알고리듬 기반 제품구매예측 모형의 개발)

  • Min, Jae-H.;Jeong, Chul-Woo
    • Journal of the Korean Operations Research and Management Science Society / v.34 no.3 / pp.29-41 / 2009
  • The purpose of this paper is to develop a new classification method for predicting consumer choice based on a genetic algorithm, and to validate its prediction power over existing methods. To serve this purpose, we propose a hybrid model and discuss its methodological characteristics in comparison with other existing classification methods. We also conduct a series of experiments employing survey data on consumer choices of MP3 players to assess the prediction power of the model. The results show that the suggested model is statistically superior to existing methods such as the logistic regression model, artificial neural network model, and decision tree model in terms of prediction accuracy. The model is also shown to have the advantage of providing strategic information of practical use for consumer choice.

A Study on the Failure Effect Analysis of Overhead Transformer Considering Weather (기상요인에 따른 가공변압기의 고장영향 분석에 관한 연구)

  • Oh, Do-Eun;Jang, Seung-Min
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.5 / pp.857-862 / 2017
  • The management of electric power facilities has become important with industrial development, and these facilities are influenced by weather. Even when a time-varying failure rate is estimated for the same kind of electric power facility, the failure rate can differ depending on external effects such as climate. This research presents data mining models of weather-related outages and of the influence of weather on electric power facilities, using recent data.

A GA-based Classification Model for Predicting Consumer Choice (유전 알고리듬 기반 제품구매예측 모형의 개발)

  • Min, Jae-Hyeong;Jeong, Cheol-U
    • Proceedings of the Korean Operations and Management Science Society Conference / 2008.10a / pp.1-7 / 2008
  • The purpose of this paper is to develop a new classification method for predicting consumer choice based on a genetic algorithm, and to validate its prediction power over existing methods. To serve this purpose, we propose a hybrid model and discuss its methodological characteristics in comparison with other existing classification methods. Also, to assess the prediction power of the model, we conduct a series of experiments employing survey data on consumer choices of MP3 players. The results show that the suggested model is statistically superior to existing methods such as the logistic regression model, artificial neural network model, and decision tree model in terms of prediction accuracy. The model is also shown to have the advantage of providing strategic information of practical use for consumer choice.

Short-term Water Demand Forecasting Algorithm Based on Kalman Filtering with Data Mining (데이터 마이닝과 칼만필터링에 기반한 단기 물 수요예측 알고리즘)

  • Choi, Gee-Seon;Shin, Gang-Wook;Lim, Sang-Heui;Chun, Myung-Geun
    • Journal of Institute of Control, Robotics and Systems / v.15 no.10 / pp.1056-1061 / 2009
  • This paper proposes a short-term water demand forecasting algorithm based on Kalman filtering with data mining, aimed at sustainable water supply and effective energy saving. The proposed algorithm applies a mining method to water supply data and uses a decision tree method to handle special days such as Chuseok. The parameters of the MLAR (Multi Linear Auto Regression) model are estimated by the Kalman filtering algorithm. Good results on actual operation data demonstrate the practicality of the proposed forecasting algorithm.
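
As a rough illustration of estimating autoregressive parameters with a Kalman filter, the sketch below tracks the single coefficient of an AR(1) model, treating the coefficient as a slowly drifting state. This is a deliberate simplification, not the paper's MLAR model: the function name `kalman_ar1`, the random-walk state model, and the noise variances `q` and `r` are all assumptions.

```python
def kalman_ar1(series, q=1e-4, r=1.0, theta0=0.0, p0=1.0):
    """Recursively estimate a in y[t] = a * y[t-1] + noise with a scalar Kalman filter.

    q: assumed variance of the coefficient's random-walk drift
    r: assumed variance of the observation noise
    Returns the final estimate and the estimate after each update.
    """
    theta, p = theta0, p0
    history = []
    for prev, cur in zip(series, series[1:]):
        p_pred = p + q                                  # predict: coefficient may drift
        gain = p_pred * prev / (prev * p_pred * prev + r)
        theta += gain * (cur - prev * theta)            # correct with the one-step error
        p = (1.0 - gain * prev) * p_pred                # updated estimate variance
        history.append(theta)
    return theta, history
```

Given a series that follows `y[t] = 0.8 * y[t-1]`, the estimate moves toward 0.8 within a few updates. Extending the sketch to several lags or extra regressors would replace the scalars with a parameter vector and a covariance matrix.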

A Six Sigma Methodology Using Data Mining : A Case Study of "P" Steel Manufacturing Company (데이터 마이닝 기반의 6 시그마 방법론 : 철강산업 적용사례)

  • Jang, Gil-Sang
    • The Journal of Information Systems / v.20 no.3 / pp.1-24 / 2011
  • Recently, six sigma has been widely adopted in a variety of industries as a disciplined, data-driven problem-solving methodology, supported by a handful of powerful statistical tools, for reducing variation through continuous process improvement. Data mining has likewise been widely used to discover unknown knowledge from large volumes of data using modeling techniques such as neural networks, decision trees, and regression analysis. This paper proposes a six sigma methodology based on data mining for effectively and efficiently processing massive data in six sigma projects. The proposed methodology is applied to the hot stove system, a major energy-consuming process at a "P" steel company, to improve heat efficiency by reducing energy consumption. The results show optimal operating conditions and a 15% reduction in hot stove energy cost.

Study on the Comparison and Analysis of Data Mining Models for the Efficient Customer Credit Evaluation (효율적인 신용평가를 위한 데이터마이닝 모형의 비교.분석에 관한 연구)

  • 김갑식
    • Journal of Information Technology Applications and Management / v.11 no.1 / pp.161-174 / 2004
  • This study is intended to suggest an optimized data mining model for efficient customer credit evaluation in the capital finance industry. To accomplish the research objective, various data mining models for customer credit evaluation are compared and analyzed. Existing models such as Multi-Layered Perceptrons, Multivariate Discrimination Analysis, Radial Basis Function, Decision Tree, and Logistic Regression are employed to analyze customer information in the capital finance market and the detailed data of capital financing transactions. Finally, the results from an integrated model utilizing a genetic algorithm are compared with those of each individual model mentioned above. The results reveal that the integrated model is superior to the other existing models.

Characteristics on Inconsistency Pattern Modeling as Hybrid Data Mining Techniques (혼합 데이터 마이닝 기법인 불일치 패턴 모델의 특성 연구)

  • Hur, Joon;Kim, Jong-Woo
    • Journal of Information Technology Applications and Management / v.15 no.1 / pp.225-242 / 2008
  • IPM (Inconsistency Pattern Modeling) is a hybrid supervised learning technique that uses the inconsistency patterns of input variables in mining data sets. The IPM tries to improve prediction accuracy by combining two or more different supervised learning methods. Previous related studies have shown that the IPM was superior to the use of a single existing supervised learning method such as neural networks, decision tree induction, or logistic regression, and that it was also superior to existing combined-model methods such as Bagging, Boosting, and Stacking. The objective of this paper is to explore the characteristics of the IPM. To understand these characteristics, three experiments were performed. In these experiments, performance improvements are high when the prediction inconsistency ratio between two different supervised learning techniques is high and when the distance between the supervised learning methods on the MDS (Multi-Dimensional Scaling) map is long.
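
The abstract does not define the prediction inconsistency ratio. One plausible reading, shown here only as an assumption, is the fraction of instances on which two classifiers disagree; the function name `inconsistency_ratio` is hypothetical.

```python
def inconsistency_ratio(preds_a, preds_b):
    """Fraction of instances where two models' predictions differ (assumed definition)."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        raise ValueError("prediction lists must not be empty")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)
```

Under this reading, two models that always agree have a ratio of 0, and the abstract's finding is that combining models pays off more as the ratio grows.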

Churn Analysis for the First Successful Candidates in the Entrance Examination for K University

  • Kim, Kyu-Il;Kim, Seung-Han;Kim, Eun-Young;Kim, Hyun;Yang, Jae-Wan;Cho, Jang-Sik
    • Journal of the Korean Data and Information Science Society / v.18 no.1 / pp.1-10 / 2007
  • In this paper, we focus on churn analysis of the first successful candidates in the 2006 entrance examination using Clementine, a data mining tool. The goal of this study is to apply decision trees (including the C5.0 and CART algorithms), neural networks, and logistic regression to predict churn among successful candidates. Using the 2006 entrance examination data of K University, we analyze churning and non-churning successful candidates, why successful candidates churn, and which successful candidates are most likely to churn in the future.
