• Title/Summary/Keyword: Decision tree regression analysis (의사결정나무회귀분석)


A Study for the Development of a Bid Price Rate Prediction Model (낙찰률 예측 모형에 관한 연구)

  • Choi, Bo-Seung;Kang, Hyun-Cheol;Han, Sang-Tae
    • Communications for Statistical Applications and Methods / v.18 no.1 / pp.23-34 / 2011
  • Property auctions have become a new method of real estate investment as the property auction market has grown in tandem with the real estate market. This study focuses on a statistical model for predicting bid price rates, the main index for participants in the real estate auction market. To estimate the monthly bid price rate, we propose a new method that compensates for the means of regions and terms and reduces the prediction error using decision tree analysis. We also propose a linear regression model to predict the bid price rate of an individual auction property. We applied the proposed model to apartment auction properties to predict the bid price rate and to categorize individual auction properties into auction grades.

A study on the optimum cutter spacing ratio according to penetration depth using decision tree-based and SVM regressions (의사결정나무 기반 회귀분석과 SVM 회귀분석을 이용한 커터 관입깊이에 따른 최적 커터간격 비 연구)

  • Lee, Gi-Jun;Ryu, Hee-Hwan;Kwon, Tae-Hyuk
    • Journal of Korean Tunnelling and Underground Space Association / v.22 no.5 / pp.501-513 / 2020
  • Cutting tests to determine cutter placement in the cutter head have been conducted in various studies. Cutter head design mainly reflects the cutter spacing at the minimum specific energy, but because the optimum cutter spacing at a given cutter penetration depth varies with rock conditions, studies on determining the optimum cutter spacing should be actively pursued. Machine learning techniques, namely a decision tree-based regression model and an SVM regression model, were applied to predict the optimum cutter spacing ratio from the nonlinear relationship between cutter penetration depth and cutter spacing. Because decision tree-based methods are strongly influenced by the amount of data, SVM regression predicted the optimum cutter spacing ratio according to penetration depth more accurately. SVM regression is therefore expected to be useful for deciding the cutter spacing in cutter head design once a large amount of data on the optimum cutter spacing ratio according to penetration depth has been accumulated.
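To make the idea of decision tree-based regression concrete, the following is a minimal sketch of a single regression-tree split (a "stump") in plain Python. It is not the authors' model: the function names `sse`, `best_split`, and `predict` are hypothetical, and the depth and spacing-ratio values are invented purely for illustration. With so few points the fit is a coarse step function, which hints at why tree-based regressors depend heavily on the amount of data.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimizing the total SSE.

    Candidate thresholds are midpoints between consecutive distinct x values.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]


def predict(x, split):
    """Predict with a fitted stump: the mean of the side that x falls on."""
    threshold, left_mean, right_mean = split
    return left_mean if x <= threshold else right_mean


# Hypothetical (penetration depth, spacing ratio) pairs, invented for illustration.
depths = [2.0, 3.0, 4.0, 6.0, 7.0, 8.0]
ratios = [10.0, 10.0, 10.0, 4.0, 4.0, 4.0]
split = best_split(depths, ratios)
```

A full regression tree applies `best_split` recursively to each side until a stopping rule is met; a library implementation would normally be used in practice.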

Development of Predictive Model for Length of Stay(LOS) in Acute Stroke Patients using Artificial Intelligence (인공지능을 이용한 급성 뇌졸중 환자의 재원일수 예측모형 개발)

  • Choi, Byung Kwan;Ham, Seung Woo;Kim, Chok Hwan;Seo, Jung Sook;Park, Myung Hwa;Kang, Sung-Hong
    • Journal of Digital Convergence / v.16 no.1 / pp.231-242 / 2018
  • Efficient management of the length of stay (LOS) is important for hospitals: it reduces medical costs for patients and increases profitability for hospitals. To manage LOS efficiently, it is necessary to develop an artificial intelligence-based prediction model that supports hospitals in benchmarking and in finding ways to reduce LOS. To develop a predictive model of LOS for acute stroke patients, records of acute stroke patients were extracted from the 2013 and 2014 discharge injury patient data. The data were split into 60% for training and 40% for evaluation. For model development, we used a traditional regression technique (multiple regression analysis), artificial intelligence techniques (an interactive decision tree and a neural network), and an ensemble technique that integrates all of them. Models were evaluated with the Root ASE (Absolute error) index, which was 23.7 for multiple regression, 23.7 for the interactive decision tree, 22.7 for the neural network, and 22.7 for the ensemble technique. In this evaluation, the neural network, an artificial intelligence technique, was found to be superior. This demonstrates the utility of artificial intelligence in developing LOS prediction models. Further research is needed on how to use artificial intelligence techniques more effectively in developing LOS prediction models.
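The 60%/40% partition described above can be sketched as a simple holdout split. This is only an illustration under stated assumptions: the abstract does not say how records were assigned, so the random shuffle, the fixed seed, and the name `holdout_split` are all hypothetical, and the patient identifiers are placeholders.

```python
import random


def holdout_split(records, train_fraction=0.6, seed=0):
    """Shuffle a copy of the records and split it into (train, evaluation)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for patient records.
patient_ids = list(range(100))
train, evaluation = holdout_split(patient_ids)
```

Fixing the seed means every candidate model is fitted and evaluated on the same partition, which keeps a comparison of error indices like the one reported here fair.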

Power Consumption Forecasting Scheme for Educational Institutions Based on Analysis of Similar Time Series Data (유사 시계열 데이터 분석에 기반을 둔 교육기관의 전력 사용량 예측 기법)

  • Moon, Jihoon;Park, Jinwoong;Han, Sanghoon;Hwang, Eenjun
    • Journal of KIISE / v.44 no.9 / pp.954-965 / 2017
  • A stable power supply is very important for the maintenance and operation of the power infrastructure, so accurate power consumption prediction is needed. In particular, a university campus is among the institutions with the highest power consumption and tends to show wide variation in electrical load depending on time and environment. For this reason, a model that can accurately predict power consumption is required for effective operation of the power system. A disadvantage of existing time series prediction techniques is that prediction performance degrades greatly, because the width of the prediction interval increases as the gap between the learning time and the prediction time increases. In this paper, we first classify power data with similar time series patterns by considering the date, day of the week, holiday, and semester. Next, an ARIMA model is constructed for each classified data set, and a daily power consumption forecasting method for the university campus is proposed through time series cross-validation of the predicted time. To evaluate prediction accuracy, we confirmed the validity of the proposed method by applying performance indicators.
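Time series cross-validation differs from ordinary cross-validation in that the training data must always precede the test data. The sketch below generates such splits with an expanding training window (often called rolling-origin evaluation). Whether the paper used exactly this variant is an assumption; the function name `rolling_origin_splits` and its parameters are hypothetical.

```python
def rolling_origin_splits(n_observations, initial_train, horizon=1):
    """Yield (train_indices, test_indices) pairs with an expanding window.

    Each test block of length `horizon` starts right after its training data,
    so no future observation is ever used to fit the model.
    """
    if initial_train < 1 or horizon < 1:
        raise ValueError("initial_train and horizon must be positive")
    origin = initial_train
    while origin + horizon <= n_observations:
        train = list(range(origin))
        test = list(range(origin, origin + horizon))
        yield train, test
        origin += horizon


# Ten daily observations, the first seven used for the initial fit.
splits = list(rolling_origin_splits(n_observations=10, initial_train=7))
```

In a forecasting workflow, a model would be refitted on each `train` index set and scored on the matching `test` indices, then the scores averaged.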

A Comparative Analysis of Risk Assessment Models for Asbestos Demolition (석면 해체 작업의 위험성평가모델 비교 분석)

  • Kim, Dong-Gyu;Kim, Min-Seung;Lee, Su-Min;Kim, Yu-Jin;Han, Seung-Woo
    • Proceedings of the Korean Institute of Building Construction Conference / 2022.11a / pp.99-100 / 2022
  • As the danger of exposure to asbestos has been revealed, the importance of asbestos demolition in existing buildings has increased. An extensive body of research has evaluated the risk of asbestos demolition, but the types of variables considered were limited because categorical information was not reflected and quantitative information was difficult to collect. This study therefore aims to derive a model that predicts the risk at asbestos demolition workplaces by collecting both categorical and continuous variables. For this purpose, categorical and continuous variables were collected from asbestos demolition reports, and the risk assessment score was set as the dependent variable. The influence of each variable was identified using logistic regression, and risk prediction methodologies were compared using decision tree regression and an artificial neural network. As a result, a conditional risk prediction model was derived to evaluate the risk of asbestos demolition, and this model is expected to be used to ensure the safety of asbestos demolition workers.


Identification of major risk factors association with respiratory diseases by data mining (데이터마이닝 모형을 활용한 호흡기질환의 주요인 선별)

  • Lee, Jea-Young;Kim, Hyun-Ji
    • Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.373-384 / 2014
  • Data mining clarifies patterns or correlations in large, complexly structured data and predicts diverse outcomes. The technique is used in finance, telecommunication, circulation, medicine, and other fields. In this paper, we selected risk factors of respiratory diseases in the field of medicine. The data, taken from the Gyeongsangbuk-do database of the Community Health Survey conducted in 2012, were divided into a respiratory disease group and a healthy group. To select major risk factors, we applied data mining techniques such as neural network, logistic regression, Bayesian network, C5.0, and CART. We divided the data into training and testing sets and applied the models built on the training data to the testing data. In a comparison of prediction accuracy, CART was identified as the best model. Depression, smoking, and stress were identified as the major risk factors of respiratory disease.

Convergence analysis of determinants affecting on geographic variations in the prevalence of arthritis in Korean women using data mining (데이터마이닝을 이용한 여성 관절염 유병률 소지역 간 변이의 융복합 요인분석)

  • Kim, Yoo-Mi;Kang, Sung-Hong
    • Journal of Digital Convergence / v.13 no.5 / pp.277-288 / 2015
  • This study aims to identify determinants of geographic variation in the prevalence of arthritis in Korean women using data mining. Data from the Korean Community Health Survey 2012, covering 249 small districts, were analyzed. Socio-demographic, health behavior and status, and morbidity status measures were analyzed using a conventional regression model and a convergence analysis method such as a decision tree. The determinants of variation in arthritis prevalence across the 249 small districts were the rates of workers in agriculture, forestry, and fishing; salaried workers; persons with education beyond high school graduation; persons not receiving needed care; persons not receiving care for economic reasons; obesity; heavy drinkers; persons complaining of chewing difficulty; persons who experienced depression; persons perceiving stress; and persons diagnosed with hypertension and angina pectoris. These districts were classified into 10 area groups by the decision tree model. Our findings suggest that an approach based on the characteristics of small area groups, rather than a nationwide or individual-level approach, would be effective in reducing variation in the prevalence of arthritis.

Development of Traffic Accident Models in Seoul Considering Land Use Characteristics (토지이용특성을 고려한 서울시 교통사고 발생 모형 개발)

  • Lim, Samjin;Park, Juntae
    • Journal of the Society of Disaster Information / v.9 no.1 / pp.30-49 / 2013
  • In this research, we developed a new traffic accident forecasting model based on land use. Forecasting models by accident type were developed using market segmentation and the Classification and Regression Tree (CART) method, with additional variables introduced to reflect the characteristics of various regions. The analysis identified activity variables, such as the registered population and commuters, as well as road size and accident-causing facilities that are the subjects of those activities, as variables explaining traffic accidents.

A Prediction of N-value Using Regression Analysis Based on Data Augmentation (데이터 증강 기반 회귀분석을 이용한 N치 예측)

  • Kim, Kwang Myung;Park, Hyoung June;Lee, Jae Beom;Park, Chan Jin
    • The Journal of Engineering Geology / v.32 no.2 / pp.221-239 / 2022
  • Unknown geotechnical characteristics are a key challenge in the design of piles for plant, civil, and building works. Although N-values obtained from the standard penetration test are important, N-values for an entire site are rarely acquired in common practice. In this study, the N-value is predicted by regression analysis with artificial intelligence (AI). Because big data is important for improving the learning performance of AI, a circular augmentation method is applied to build up the data set. The optimal model was chosen from the applied AI algorithms (artificial neural network, decision tree, and auto machine learning) as the one that minimizes the margin of error. To evaluate the method, actual and predicted data from six completed projects in Poland, Indonesia, and Malaysia were compared. The results indicate that the AI prediction of this method is reliable. Therefore, the geotechnical characteristics of non-boring points can be predicted, and an optimal arrangement of structures can be achieved using a three-dimensional N-value distribution map.

A study on the behavior of cosmetic customers (화장품구매 자료를 통한 고객 구매행태 분석)

  • Cho, Dae-Hyeon;Kim, Byung-Soo;Seok, Kyung-Ha;Lee, Jong-Un;Kim, Jong-Sung;Kim, Sun-Hwa
    • Journal of the Korean Data and Information Science Society / v.20 no.4 / pp.615-627 / 2009
  • In micro-marketing promotion, it is important to understand customer behavior. In this study, we are interested in forecasting customer repurchase from customer behavior. By analyzing cosmetic transaction data, we derive variables that play an important role in understanding customer behavior and in modeling repurchase. As modeling tools, we use decision tree, logistic regression, and neural network models. We chose the decision tree as the final model because it yields the smallest RASE (root average squared error) and the highest correct classification rate.
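The two selection criteria named above can be written down in a few lines. The sketch below assumes a binary repurchase label (1 or 0) and a predicted repurchase probability per customer; the function names, the 0.5 cutoff, and the sample values are hypothetical and are not taken from the paper.

```python
from math import sqrt


def rase(actual, predicted):
    """Root average squared error between actual values and predictions."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def correct_classification_rate(actual, predicted, cutoff=0.5):
    """Share of cases whose thresholded prediction matches the actual label."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum((p >= cutoff) == bool(a) for a, p in zip(actual, predicted))
    return hits / len(actual)


# Invented labels (1 = repurchased) and predicted probabilities.
repurchased = [1, 0, 1, 1]
scores = [0.9, 0.1, 0.7, 0.3]
error = rase(repurchased, scores)
accuracy = correct_classification_rate(repurchased, scores)
```

A lower RASE and a higher correct classification rate both favor a model, so a candidate that wins on both, as the decision tree did here, is a natural final choice.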
