• Title/Summary/Keyword: Tree mining

Developing the high risk group predictive model for student direct loan default using data mining (데이터마이닝을 이용한 학자금 대출 부실 고위험군 예측모형 개발)

  • Choi, Jae-Seok;Han, Jun-Tae;Kim, Myeon-Jung;Jeong, Jina
    • Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1417-1426 / 2015
  • We develop a predictive model for identifying groups at high risk of loan default, using the Korea Student Aid Foundation's direct loan data from 2012 to 2014. We perform decision tree analysis with SAS Enterprise Miner 13.2. The resulting model classifies subjects into 25 types. The study shows that the major factors influencing loan default are household income, national grant, age, overdue record, level of schooling, field of study, and monthly repayment. The high-risk group predictive model can serve as the basis for a segmented management service for preventing loan default.

Customer Churn Prediction of Automobile Insurance by Multiple Models (다중모델을 이용한 자동차 보험 고객의 이탈예측)

  • Lee, Jae-Sik;Lee, Jin-Chun
    • Journal of Intelligence and Information Systems / v.12 no.2 / pp.167-183 / 2006
  • Because data mining tries to find unknown facts or rules even in vaguely understood data sets, it is prone to a high error rate. To reduce the error rate, many researchers have combined multiple models to solve a single problem. In this research, we present a new type of multiple-model approach, called DyMoS, whose distinguishing feature is that it first classifies the input data and then applies the model developed specifically for each class of data. To evaluate the performance of DyMoS, we applied it to a real customer churn problem at an automobile insurance company. The results show that DyMoS outperformed every model that employed only a single data mining technique, such as an artificial neural network, a decision tree, or case-based reasoning.
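
The abstract describes DyMoS only at a high level: classify the input, then apply the model built for that class. The following is a minimal sketch of that routing idea, not the authors' implementation. The segmenter, the field names (`years_insured`, `premium_increase`, `claims_last_year`), and the per-segment rules are all hypothetical stand-ins for trained models.

```python
from typing import Callable, Dict

Record = Dict[str, float]
Model = Callable[[Record], bool]  # returns True when churn is predicted


def make_router(segmenter: Callable[[Record], str],
                models: Dict[str, Model],
                default: Model) -> Model:
    """Build a predictor that assigns a record to a segment and then
    delegates to the model registered for that segment."""
    def predict(record: Record) -> bool:
        segment = segmenter(record)
        # Fall back to the default model for segments with no dedicated model.
        return models.get(segment, default)(record)
    return predict


# Hypothetical segmenter: splits customers by how long they have been insured.
def by_tenure(record: Record) -> str:
    return "new" if record["years_insured"] < 2 else "long_term"


# Hypothetical per-segment models; simple rules stand in for trained models.
def new_customer_model(record: Record) -> bool:
    return record["premium_increase"] > 0.05


def long_term_model(record: Record) -> bool:
    return record["claims_last_year"] >= 2 and record["premium_increase"] > 0.10


def fallback_model(record: Record) -> bool:
    return False


predict_churn = make_router(
    by_tenure,
    {"new": new_customer_model, "long_term": long_term_model},
    fallback_model,
)
```

In practice each entry in `models` would be a separately trained classifier (for example a neural network for one segment and a decision tree for another), and the segmenter itself could be a learned model.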

Convergence outpatient medical service patient experience research using data mining (데이터마이닝 기법을 이용한 융복합 외래 의료서비스 환자경험조사 연구)

  • Yoo, Jin-Yeong
    • Journal of Digital Convergence / v.18 no.7 / pp.299-306 / 2020
  • This study uses data mining techniques to analyze patient experience surveys of convergence outpatient medical services, with the aim of identifying specific measures that support the management strategy of patient-centered medical institutions as medical culture becomes more patient-centered. Using the raw data of the 2018 Medical Service Experience Survey, we analyzed 8,843 people over the age of 15 who had experienced outpatient medical services, applying decision tree analysis. The determinants of satisfaction with the outpatient experience were the doctor's area and the patient's rights protection area, and the determinants of intention to recommend outpatient medical services were the doctor's area and facility comfort. Women rated overall satisfaction more positively than men, and those over the age of 60 rated both overall satisfaction and intention to recommend positively. The study is significant in that it presents an outpatient experience decision-making model and shows that the doctor's area, patient's rights protection area, and facility comfort are important factors. Long-term research on the Medical Service Experience Survey is needed, as is research on the inpatient medical service experience.

Analysis of the Factors and Patterns Associated with Death in Aircraft Accidents and Incidents Using Data Mining Techniques (데이터 마이닝 기법을 활용한 항공기 사고 및 준사고로 인한 사망 발생 요인 및 패턴 분석)

  • Kim, Jeong-Hun;Kim, Tae-Un;Yoo, Dong-Hee
    • Journal of Digital Convergence / v.17 no.9 / pp.79-88 / 2019
  • This study analyzes the influential factors and patterns associated with death in aircraft accidents and incidents using data mining techniques. To this end, we used two datasets on aircraft accidents and incidents, one from the National Transportation Safety Board (NTSB) and the other from the Federal Aviation Administration (FAA). We built prediction models with a decision tree classifier to predict death in aircraft accidents or incidents, and from these models derived the main causal factors and patterns associated with death. In the NTSB data, deaths occurred frequently when the aircraft was destroyed or when people were performing dangerous missions or maneuvers. In the FAA data, deaths were mainly caused by pilots who were less skilled or less qualified when their aircraft were partially destroyed. Several death-related patterns were also found for parachute jumping and for the ascending and descending phases of flight. Using the derived patterns, we proposed strategies to help prevent death in aircraft accidents or incidents.

A Study on the Implementation of an optimized Algorithm for association rule mining system using Fuzzy Utility (Fuzzy Utility를 활용한 연관규칙 마이닝 시스템을 위한 알고리즘의 구현에 관한 연구)

  • Park, In-Kyu;Choi, Gyoo-Seok
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.1 / pp.19-25 / 2020
  • In frequent pattern mining, the uncertainty of each item is accompanied by a loss of information. Also, in real environments the importance of patterns changes over time, so fuzzy logic must be applied to meet these requirements, and the dynamic characteristics of pattern importance should be considered. In this paper, we propose a fuzzy utility mining technique that extracts frequent web page sets from web log databases through fuzzy utility-based web page set mining. Here, the downward closure property of the fuzzy set is applied to prune a large part of the search space using the minimum fuzzy utility threshold (MFUT) and the user-defined percentile (UDP). Extensive performance analyses show that our algorithm is very efficient and scalable for fuzzy utility mining using dynamic weights.

Naval Vessel Spare Parts Demand Forecasting Using Data Mining (데이터마이닝을 활용한 해군함정 수리부속 수요예측)

  • Yoon, Hyunmin;Kim, Suhwan
    • Journal of Korean Society of Industrial and Systems Engineering / v.40 no.4 / pp.253-259 / 2017
  • Recent developments in science and technology have modernized the weapon systems of the ROKN (Republic of Korea Navy). Although the cost of purchasing, operating, and maintaining cutting-edge weapon systems has increased significantly, national defense expenditure is under a tight budget constraint. To maintain the availability of ships at low cost, we need accurate demand forecasts for spare parts. We attempted to find consumption patterns using data mining techniques. First, we gathered a large amount of component consumption data through DELIIS (Defense Logistics Integrated Information System). Through data collection, we obtained 42 variables, such as annual consumption quantity, ASL selection quantity, and order-release ratio. The objective variable is the quantity of spare parts purchased in f-year, and MSE (mean squared error) is used as the measure of predictive power. To construct an optimal demand forecasting model, we used a regression tree model, a random forest model, a neural network model, and a linear regression model as data mining techniques. The open-source software R was used for model construction. The results show that the random forest model achieved the best MSE. The important variables used in all models are consumption quantity, ASL selection quantity, and order-release rate. Our approach forecasts demand with higher accuracy than previous work. Data mining can also be used to identify the variables that are related to demand forecasting.
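
The model comparison described above comes down to computing MSE for each candidate model on the same held-out data and choosing the smallest value. The sketch below shows that selection step in Python rather than the R used in the study. The model names mirror the abstract, but every number is an invented toy value and not a result from the paper.

```python
from typing import Dict, Sequence, Tuple


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def best_model_by_mse(actual: Sequence[float],
                      predictions: Dict[str, Sequence[float]]
                      ) -> Tuple[str, Dict[str, float]]:
    """Score every model on the same data and return the lowest-MSE model."""
    scores = {name: mean_squared_error(actual, pred)
              for name, pred in predictions.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Toy data only: purchased quantities for four hypothetical parts.
actual = [10.0, 0.0, 4.0, 7.0]
predictions = {
    "regression_tree": [8.0, 1.0, 4.0, 9.0],
    "random_forest": [9.0, 0.0, 5.0, 7.0],
    "linear_regression": [6.0, 2.0, 2.0, 5.0],
}

best, scores = best_model_by_mse(actual, predictions)
```

Because MSE squares each error, a model that is occasionally far off is penalized more heavily than one that is slightly off everywhere, which matters for spare parts with intermittent demand.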

A Data Mining Approach to Intelligent College Road Map Advice Service (데이터 마이닝을 이용한 지능형 전공지도시스템 연구)

  • Choe, Deok-Won;Jo, Gyeong-Pil;Sin, Jin-Gyu
    • Proceedings of the Korea Intelligent Information System Society Conference / 2005.05a / pp.266-273 / 2005
  • Data mining techniques enable us to generate useful information for decision support from data that are produced and accumulated during routine organizational management activities. A college administration system is a typical example: it builds up a warehouse of student records as each student enters the college and takes part in curricular and extracurricular activities. So far, these data have been used only for very limited student service purposes, such as issuing transcripts, graduation evaluation, and GPA calculation. In this paper, we use Holland career search test results, TOEIC scores, course work lists, and GPA scores as the input for data mining and for generating student advisory information. Factor analysis, AHP (Analytic Hierarchy Process), artificial neural network, and CART (Classification And Regression Tree) techniques are deployed in the data mining process. Because these techniques are very powerful at processing large-scale student databases and discovering useful knowledge in them, we can expect highly sophisticated student advisory knowledge and services that may not be obtainable from human student advisors.

A Study on the Node Split in Decision Tree with Multivariate Target Variables (다변량 목표변수를 갖는 의사결정나무의 노드분리에 관한 연구)

  • Kim, Seong-Jun
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.4 / pp.386-390 / 2003
  • Data mining is the process of discovering patterns useful for decision making in large amounts of data. It has recently received much attention in a wide range of business and engineering fields. Classifying a group into subgroups is one of the most important subjects in data mining. Tree-based methods, known as decision trees, provide an efficient way of finding a classification model. The primary concern in tree learning is to minimize node impurity, which is evaluated using a target variable in the data set. However, there are situations where multiple target variables should be taken into account, such as manufacturing process monitoring, marketing science, and clinical and health analysis. The purpose of this article is to present some methods for measuring node impurity that are applicable to data sets with multivariate target variables. For illustration, a numerical example is given with discussion.
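
The abstract does not say which impurity measures the article proposes. As one simple illustration of the general idea, the sketch below averages the Gini impurity over all target variables and uses the weighted decrease to score a candidate split. This averaging rule is an assumption chosen for clarity, not the article's method, and the function names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple

Row = Tuple[Hashable, ...]  # one label per target variable


def gini(labels: Sequence[Hashable]) -> float:
    """Gini impurity of a single categorical target."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def multi_target_impurity(rows: Sequence[Row]) -> float:
    """Assumed measure: mean of the per-target Gini impurities in a node."""
    if not rows:
        return 0.0
    columns = list(zip(*rows))  # regroup labels by target variable
    return sum(gini(col) for col in columns) / len(columns)


def split_gain(parent: Sequence[Row], left: Sequence[Row], right: Sequence[Row]) -> float:
    """Impurity decrease from splitting parent into left and right children."""
    n = len(parent)
    weighted = (len(left) / n) * multi_target_impurity(left) \
        + (len(right) / n) * multi_target_impurity(right)
    return multi_target_impurity(parent) - weighted


# Toy node with two targets; the split separates the first target perfectly
# but leaves the second target as mixed as before.
parent = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
left, right = parent[:2], parent[2:]
gain = split_gain(parent, left, right)
```

Averaging treats every target as equally important; a weighted sum, or a measure that accounts for correlation between targets, would be a natural alternative when the targets differ in priority or are strongly related.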

Design Analysis of Current Density in Lithium Secondary Battery Using Data Mining Techniques (데이터 마이닝을 이용한 리튬 이차전지의 전류밀도 영향인자 분석)

  • Jeong, Dong Ho;Lee, Jongsoo;Choi, Ha-Young
    • Transactions of the Korean Society of Mechanical Engineers A / v.38 no.6 / pp.677-682 / 2014
  • In the present study, a decision tree and artificial neural network were used to determine critical design parameters for lithium ion batteries and compare their performances. First, a design method that used a decision tree-artificial neural network model was used to determine the major design factors among early pole plate design factors that showed nonlinearity. Then, the artificial neural network was used to implement a weighted value analysis of the importance of the design factors and their effect on the current density. The second method involved the use of an artificial neural network model to construct artificial networks without separate determinations of the major early design factors to analyze the connections and weighted values related to the current density.

A study on removal of unnecessary input variables using multiple external association rule (다중외적연관성규칙을 이용한 불필요한 입력변수 제거에 관한 연구)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.877-884 / 2011
  • The decision tree is a representative data mining algorithm used in many domains, such as retail target marketing, fraud detection, data reduction, variable screening, and category merging. The method is most useful in classification problems and for making predictions for a target group after dividing it into several small groups. When we build a decision tree model with a large number of input variables, the resulting trees are complex, which makes the model difficult to explore and analyze. In addition, we often find an association between input variables that is induced by external variables even though no intrinsic association exists. In this paper, we study a method for removing unnecessary input variables using multiple external association rules, and we apply the method to actual data to evaluate its efficiency.