• Title/Abstract/Keywords: decision tree regression

Search Result 324, Processing Time 0.027 seconds

A Study on Factors of Elderly Residential Care Service Utilization for using Decision Tree Regression (의사결정분석을 이용한 우리나라 노인의 요양시설서비스 이용 결정요인에 관한 연구)

  • Lim, Jeong-Gi
    • Korean Journal of Social Welfare / v.60 no.3 / pp.129-150 / 2008
  • This study examined the factors affecting the utilization of elderly residential care services among long-term care service recipients during the pilot project period of long-term care insurance in Korea. The help-seeking behavior model developed by Andersen and Newman (1973) was used to analyze the factors affecting residential care service utilization among 1,939 long-term care service recipients. Frequency analysis and decision tree regression analysis were performed in SPSS 13.0. The analyses show that the strongest significant factor is service preference (a predisposing factor), followed by enabling factors such as co-residence type and household income. The results also indicate that need factors such as cognitive disorder, problem behavior, and ADL and IADL disabilities affect the utilization of elderly residential care services. These findings offer implications and suggestions for how the long-term care service system could be established in Korea, and they give policy makers and helping professionals information about a target-efficient long-term care continuum system.

Study on the ensemble methods with kernel ridge regression

  • Kim, Sun-Hwa;Cho, Dae-Hyeon;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / v.23 no.2 / pp.375-383 / 2012
  • The purpose of ensemble methods is to increase prediction accuracy by combining many classifiers. Recent studies have shown that random forests and forward stagewise regression achieve good accuracy in classification problems. However, they have large prediction errors near separation boundary points because they use decision trees as base learners. In this study, we use kernel ridge regression instead of decision trees in random forests and boosting. The usefulness of the proposed ensemble methods is shown by simulation results on the prostate cancer and Boston housing data.

An Analysis of Choice Behavior for Tour Type of Commercial Vehicle using Decision Tree (의사결정나무를 이용한 화물자동차 투어유형 선택행태 분석)

  • Kim, Han-Su;Park, Dong-Ju;Kim, Chan-Seong;Choe, Chang-Ho;Kim, Gyeong-Su
    • Journal of Korean Society of Transportation / v.28 no.6 / pp.43-54 / 2010
  • In recent years there have been studies on tour-based approaches to freight travel demand modelling. The purpose of this paper is to analyze the tour type choice behavior of commercial vehicles, with tours divided into round trips and chained tours. The study uses the decision tree and the logit model. The results indicate that the explanatory variables for classifying the tour types of commercial vehicles are loading factor, average goods quantity, and total goods quantity. The results of the decision tree method are similar to those of the logit model. In addition, the explanatory variables for tour type classification of small trucks do not differ from those for medium trucks, implying that the most important factor in vehicle tour planning is how goods are loaded, such as shipment size and total quantity.

A Study on the Combined Decision Tree(C4.5) and Neural Network Algorithm for Classification of Mobile Telecommunication Customer (이동통신고객 분류를 위한 의사결정나무(C4.5)와 신경망 결합 알고리즘에 관한 연구)

  • 이극노;이홍철
    • Journal of Intelligence and Information Systems / v.9 no.1 / pp.139-155 / 2003
  • This paper presents a new methodology for analyzing and classifying customer patterns in the mobile telecommunication market, aimed at improving the prediction of credit information using a decision tree and a neural network. By applying the variance selection process from the decision tree, a systematic process for defining the input vector's values and for rule generation was developed. From a customer management perspective, this research analyzes current customers and derives their patterns so that the company can maintain good customer relationships and give special attention in advance to customers who have a high potential of terminating their contracts. An implementation of the proposed method shows that its prediction accuracy is higher than that of existing methods such as decision trees (CART, C4.5), regression, neural networks, and a combined model (CART and NN).

Customer Segmentation of a Home Study Company using a Hybrid Decision Tree and Artificial Neural Network Model (하이브리드 의사결정나무와 인공신경망 모델을 이용한 방문학습지사의 고객세분화)

  • Seo Kwang-Kyu;Ahn Beum-Jun
    • Journal of the Korea Academia-Industrial cooperation Society / v.7 no.3 / pp.518-523 / 2006
  • Because of keen competition, companies segment their customers and try to serve specially targeted customers through differentiated methods. Accordingly, data mining techniques have drawn attention as an effective way to extract useful information. This paper explores customer segmentation for a home study company using a hybrid decision tree and artificial neural network model. By applying the variance selection process from the decision tree, a systematic process for defining the input vector's values and for rule generation was developed. From a customer management perspective, this research analyzes current customers and derives their patterns so that the company can maintain good customer relationships. The case study shows that the prediction accuracy of the proposed model is higher than that of regression, decision tree (CART), and artificial neural network models.

The Automated Threshold Decision Algorithm for Node Split of Phonetic Decision Tree (음소 결정트리의 노드 분할을 위한 임계치 자동 결정 알고리즘)

  • Kim, Beom-Seung;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.31 no.3 / pp.170-178 / 2012
  • In this paper, a phonetic decision tree of triphone units was built for phoneme-based speech recognition of 640 stations operated by Korail. The clustering rate was determined by Pearson and regression analysis in order to decide the threshold used in node splitting. Using the determined clustering rate, thresholds are decided automatically according to the average clustering rate. In recognition experiments verifying the proposed method, performance improved by 1.4 to 2.3% (absolute) over the baseline system.

Development of a Detection Model for the Companies Designated as Administrative Issue in KOSDAQ Market (KOSDAQ 시장의 관리종목 지정 탐지 모형 개발)

  • Shin, Dong-In;Kwahk, Kee-Young
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.157-176 / 2018
  • The purpose of this research is to develop a detection model for companies designated as administrative issues in the KOSDAQ market using financial data. Administrative issue designation identifies companies with a high potential for delisting and gives them time to resolve the grounds for delisting under certain restrictions of the Korean stock market. It acts as an alarm that informs investors and market participants of which companies are likely to be delisted and warns them to invest safely. Despite this importance, there are relatively few studies on administrative issue prediction models compared with the many studies on bankruptcy prediction models. Therefore, this study develops and verifies a detection model for companies designated as administrative issues using financial data of KOSDAQ companies. In this study, logistic regression and decision tree are proposed as the data mining models for detecting administrative issues. According to the results of the analysis, the logistic regression model predicted the companies designated as administrative issues using three variables, ROE (earnings before tax), cash flows/shareholder's equity, and asset turnover ratio, and its overall accuracy was 86% for the validation dataset. The decision tree (Classification and Regression Trees, CART) model applied classification rules using cash flows/total assets and ROA (net income), and its overall accuracy reached 87%. The implications of the financial indicators selected in the logistic regression and decision tree models are as follows. First, ROE (earnings before tax) in the logistic detection model shows the profit and loss of the business segments that will continue, excluding the revenue and expenses of discontinued businesses. Therefore, a weakening of this variable means that the competitiveness of the core business has weakened. If a large part of the profit comes from one-off gains, the deterioration of business management is very likely to intensify further. As the ROE of a KOSDAQ company decreases significantly, it becomes highly likely that the company will be delisted. Second, cash flows to shareholder's equity represents the firm's ability to generate cash flow when the financial condition of subsidiary companies is excluded. In other words, a weakening of the parent company's management capacity, excluding the subsidiaries' competence, can be a main reason for an increased possibility of administrative issue designation. Third, a low asset turnover ratio means that current and non-current assets are used ineffectively by the corporation, or that its asset investment is excessive. If the asset turnover ratio of a KOSDAQ-listed company decreases, it is necessary to examine corporate activities in detail from various perspectives, such as weakening sales or increases or decreases in the company's inventories. Cash flow/total assets, a variable selected by the decision tree detection model, is a key indicator of the company's cash condition and its ability to generate cash from operating activities. Cash flow indicates whether a firm can perform its main activities (maintaining its operating ability, repaying debts, paying dividends, and making new investments) without relying on external financial resources. Therefore, if the value of this variable is negative (-), it indicates the possibility that the company has serious problems in its business activities. If the cash flow from operating activities of a specific company is smaller than its net profit, the net profit has not been converted to cash, which indicates a serious problem in managing the company's trade receivables and inventory assets. Therefore, it can be understood that as cash flows/total assets decreases, the probability of administrative issue designation and the probability of delisting increase. In summary, the logistic regression-based detection model in this study was found to be affected by the company's financial activities, including ROE (earnings before tax), whereas the decision tree-based detection model predicts the designation based on the company's cash flows.

Using CART to Evaluate Performance of Tree Model (CART를 이용한 Tree Model의 성능평가)

  • Jung, Yong Gyu;Kwon, Na Yeon;Lee, Young Ho
    • Journal of Service Research and Studies / v.3 no.1 / pp.9-16 / 2013
  • Classification is a universal technique in data analysis, though it requires a lot of effort; decision trees produce results that are easy to analyze and understand. The decision tree developed by Breiman is one of the most representative methods. A decision tree has two core components: one is to repeatedly divide the space of the independent variables, and the other is pruning using evaluation data. In a classification problem, the response variable is categorical. The tree repeatedly splits the multidimensional variable space into non-overlapping rectangular regions, where the variables may be continuous, binary, ordinal, and so on. In this paper, we obtain the precision, reproducibility (recall), and accuracy of the classification tree in classifying new cases and evaluate its performance through experiments.
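
The three evaluation measures named in this abstract can be illustrated with a minimal Python sketch. It assumes a binary problem and reads the abstract's "reproducibility" as recall; the function name `classification_scores` and the label lists are hypothetical illustrations, not data or code from the paper.

```python
def classification_scores(y_true, y_pred, positive=1):
    """Return (precision, recall, accuracy) for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, accuracy


# Made-up labels for eight new cases (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

precision, recall, accuracy = classification_scores(y_true, y_pred)
print(precision, recall, accuracy)  # 0.6 0.75 0.625
```

Here the hypothetical tree makes 3 true-positive, 2 false-positive, 1 false-negative, and 2 true-negative predictions, which is why precision (3/5) and recall (3/4) differ.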

A Study of Factors Influencing University Royalty through Education Satisfaction (교육만족도를 통한 대학생들의 대학 충성도에 영향을 미치는 요인에 대한 연구)

  • Kang, Min-Chae
    • The Journal of the Korea Contents Association / v.17 no.4 / pp.365-374 / 2017
  • The purpose of this study is to verify the relation between satisfaction with university education and loyalty, based on an analysis of satisfaction survey results from all enrolled students at J regional university. University loyalty, along with the dropout rate, is one of the key indicators for managing university performance, and it offers a differentiated approach with a positive perspective. Based on the satisfaction survey results, first, there was a significant difference in satisfaction by school year and grade range. Second, the logistic regression analysis performed to verify the constructs affecting students' university loyalty shows that satisfaction with lectures, academic guidance, educational environment, and self-management in academic life had a significant impact on loyalty. Also, the decision tree analysis shows that the top decision factor determining university loyalty is self-satisfaction with university life.

A Comparative Study of Phishing Websites Classification Based on Classifier Ensemble

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Korea Multimedia Society / v.21 no.5 / pp.617-625 / 2018
  • Phishing websites have become a crucial concern in cyber security applications. Phishing is performed by fraudulently deceiving users with the aim of obtaining sensitive information such as bank account information, credit card details, usernames, and passwords. The threat has led to huge losses for online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches, i.e., random forest, rotation forest, gradient boosted machine, and extreme gradient boosting, against single classifiers, i.e., decision tree, classification and regression tree, and credal decision tree, in the case of website phishing. Area under the ROC curve (AUC) is employed as the performance metric, whilst statistical tests are used as a baseline indicator for evaluating significance among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.
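
As a rough illustration of the AUC metric mentioned in this abstract, the sketch below computes it from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and both score lists are hypothetical; they are not results from the paper and are not tied to any particular classifier.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = phishing) and scores from two hypothetical classifiers.
y_true = [1, 1, 0, 1, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
scores_b = [0.6, 0.3, 0.5, 0.7, 0.4, 0.2]

auc_a = roc_auc(y_true, scores_a)  # 8 of 9 pairs ranked correctly
auc_b = roc_auc(y_true, scores_b)  # 7 of 9 pairs ranked correctly
print(auc_a, auc_b)
```

The pairwise loop is quadratic in the number of cases, which is fine for a worked example; a sort-based rank computation would be the usual choice for large datasets.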