• Title/Summary/Keyword: regression tree (회귀트리)

Prediction of Water Usage in Pig Farm based on Machine Learning (기계학습을 이용한 돈사 급수량 예측방안 개발)

  • Lee, Woongsup;Ryu, Jongyeol;Ban, Tae-Won;Kim, Seong Hwan;Choi, Heechul
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.21 no.8
    • /
    • pp.1560-1566
    • /
    • 2017
  • Recently, the spread of smart pig farms equipped with Internet-of-Things-based sensors has made it possible to accumulate data on pig farms, and various machine learning algorithms are being applied to these data to improve farm productivity. Herein, multiple machine learning schemes are used to predict water usage in a pig farm, which is known to be one of the most important elements of pig farm management. Specifically, regression algorithms (linear regression, regression tree, and AdaBoost regression) and classification algorithms (logistic classification, decision tree, and support vector machine) are applied to derive a prediction scheme that forecasts water usage based on the temperature and humidity of the pig farm. Through performance evaluation, we find that water usage can be predicted with high accuracy. The proposed scheme can be used to detect malfunctions of the water system, which helps prevent the death of pigs and reduce farm losses.

Prediction Model for Unpaid Customers Using Big Data (빅 데이터 기반의 체납 수용가 예측 모델)

  • Jeong, Jaean;Lee, Kyouhwan;Jung, Hoekyung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.24 no.7
    • /
    • pp.827-833
    • /
    • 2020
  • In this paper, to reduce the unpaid rate of local governments, the internal data elements affecting arrears in Water-INFOS are identified through interviews with meter readers in certain local governments. Candidate data affecting arrears were also derived from national statistical data. The influence of each independent variable on the dependent variable was assessed using information gain, which measures the disorder of the dependent variable in the data set. We also compared the prediction rates of decision tree and logistic regression using n-fold cross-validation. The results confirmed that the decision tree can find customer payment patterns more accurately than logistic regression. In developing the analysis model with machine learning, the optimal values of two environmental variables, the minimum number of data and the maximum purity, which directly affect the complexity and accuracy of the decision tree, are derived to improve the accuracy of the algorithm.
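
The information gain mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard Shannon-entropy definition (entropy of the labels minus the weighted entropy after grouping by a feature); the abstract does not give the paper's exact formula, and the function names and toy data below are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after grouping by feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data (not from the paper): payment status and two candidate features.
status = ["paid", "paid", "unpaid", "unpaid"]
informative_feature = ["a", "a", "b", "b"]    # separates the two classes completely
uninformative_feature = ["a", "b", "a", "b"]  # leaves each group evenly mixed

gain_informative = information_gain(informative_feature, status)
gain_uninformative = information_gain(uninformative_feature, status)
```

A feature with a higher gain removes more of the disorder in the dependent variable, so it is a stronger candidate for a decision tree split.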

Segmental duration modelling for Korean text-to-speech synthesis (한국어 음성합성에서 음운지속시간 모델화)

  • Lee YangHee
    • Proceedings of the KSPS conference
    • /
    • 1996.02a
    • /
    • pp.125-135
    • /
    • 1996
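  • In this paper, to synthesize natural-sounding speech, we statistically analyzed how Korean segmental durations vary with the number of syllables and the syllable position within word phrases and phrases, and with the adjacent phonemes. We then modeled segmental duration by generating a regression tree that uses the analyzed timing features as control factors. In prediction experiments with the proposed duration model, the multiple correlation coefficient between measured and predicted values was about 0.74, and at least 75% of the prediction errors for each phoneme were within 25 ms, which supports the validity of the proposed model.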
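
As a rough illustration of how a regression tree chooses a split, the sketch below searches one numeric feature for the threshold that minimizes the summed squared error of the two resulting groups. This is the common CART-style criterion and is an assumption here; the abstract does not state which tree-growing algorithm the paper used, and the function names and toy durations are hypothetical.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the split x <= threshold with the lowest SSE.

    Returns (None, sse(ys)) when no split improves on the unsplit data.
    """
    pairs = sorted(zip(xs, ys))
    best = (None, sse(ys))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical toy data (not from the paper): syllable position vs. duration in ms.
positions = [1, 2, 3, 4, 5, 6]
durations = [60.0, 62.0, 61.0, 90.0, 92.0, 91.0]
threshold, error = best_split(positions, durations)
```

A full regression tree applies this search recursively over all features and predicts the mean of the training values in each leaf.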

A Study on Factors of Education's Outcome using Decision Trees (의사결정트리를 이용한 교육성과 요인에 관한 연구)

  • Kim, Wan-Seop
    • Journal of Engineering Education Research
    • /
    • v.13 no.4
    • /
    • pp.51-59
    • /
    • 2010
  • To manage university lectures efficiently and improve educational outcomes, a process is needed that diagnoses the current educational outcome of each class of a lecture and finds the factors behind it. Most studies seeking the factors of efficient lectures use statistical methods such as association analysis and regression analysis, and recently decision tree analysis has also been employed. Decision tree analysis has the merits that the resulting model is easy to understand and easy to apply to decision making, but it has the weakness that it is not robust to characteristics of the input data such as multicollinearity. This paper points out these weaknesses of decision tree analysis and suggests an experimental solution that uses multiple decision tree algorithms to compensate for them. The experimental results show that the suggested method is more effective in finding reliable factors of educational outcome.

Head Pose Estimation Based on Perspective Projection Using PTZ Camera (원근투영법 기반의 PTZ 카메라를 이용한 머리자세 추정)

  • Kim, Jin Suh;Lee, Gyung Ju;Kim, Gye Young
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.7 no.7
    • /
    • pp.267-274
    • /
    • 2018
  • This paper describes a head pose estimation method using a PTZ (Pan-Tilt-Zoom) camera. When the external parameters of a camera are changed by rotation and translation, the estimated face pose for the same head also varies. In this paper, we propose a new method that estimates the head pose independently of changes in the parameters of the PTZ camera. The proposed method consists of three steps: face detection, feature extraction, and pose estimation. For these steps, we respectively use the MCT (Modified Census Transform) feature, the facial regression tree method, and the POSIT (Pose from Orthography and Scaling with ITeration) algorithm. The existing POSIT algorithm does not consider the rotation of a camera, so this paper improves POSIT based on perspective projection in order to estimate the head pose robustly even when the external parameters of the camera change. Through experiments, we confirmed that the RMSE (Root Mean Square Error) of the proposed method is $0.6^{\circ}$ lower than that of the conventional method.

A Study on the Diagnosis of Treeing Breakdown and Fractal Characteristics Using the Method of Acoustic Emission (음향방출 계측법을 이용한 프랙탈 특성과 트리잉 파괴진단에 관한 연구)

  • 김성홍;심종탁;김재환
    • The Proceedings of the Korean Institute of Illuminating and Electrical Installation Engineers
    • /
    • v.11 no.6
    • /
    • pp.50-56
    • /
    • 1997
  • This work concerns the prediction of breakdown from tree degradation of insulating materials caused by partial discharge occurring at various defects in the polymer insulator itself and at the interfaces between electrodes and the insulating materials. Treeing due to partial discharge is one of the main causes of breakdown of insulating materials. Recently, it has become important to establish ways to diagnose the aging of insulation materials and to predict insulation breakdown. The purpose of our work is to use an acoustic emission system and fractal dimension to investigate the treeing phenomena in polymeric insulation under an applied AC voltage of 11[kV] with an artificial needle-shaped void (1.5[mm]).

A Study on the Prediction Models of Used Car Prices for Domestic Brands Using Machine Learning (머신러닝을 활용한 브랜드별 국내 중고차 가격 예측 모델에 관한 연구)

  • Seungjun Yim;Joungho Lee;Choonho Ryu
    • Journal of Service Research and Studies
    • /
    • v.13 no.3
    • /
    • pp.105-126
    • /
    • 2023
  • The domestic used car market continues to grow along with used car online platform services. These services disclose vehicle specifications, accident history, inspection history, and detailed options to consumers. Most preceding studies predicted used car prices using vehicle specifications and some vehicle options. As a result of those studies, it was confirmed that there is a nonlinear relationship between used car prices and some specification variables. Accordingly, researchers tried to solve the nonlinearity problem by applying machine learning models. In general, regression-based machine learning models had the advantage of revealing the actual influence and direction of variables, but the disadvantage of worse cost function figures than decision-tree-based machine learning models. This study attempted to predict used car prices for six domestic brands by using both vehicle specifications and vehicle options. In doing so, we tried to combine the advantages of the two types of machine learning models. To this end, we sequentially ran a regression-based machine learning model and a decision-tree-based machine learning model. As a result of the analysis, the practical influence and direction of the variables for each brand were identified, and the best tree-based machine learning model was selected. The implications of this study are as follows. It will help buyers and sellers who use used car online platform services to predict approximate used car prices. It is also hoped that it will help solve the problems caused by information inequality among users of these services.

Machine Learning Methods to Predict Vehicle Fuel Consumption

  • Ko, Kwangho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.9
    • /
    • pp.13-20
    • /
    • 2022
  • ML (Machine Learning) models are proposed and analyzed to predict vehicle FC (Fuel Consumption) in real time. Test driving was conducted with a car to measure vehicle speed, acceleration, road gradient, and FC for the training dataset. Various ML models were trained with the feature data of speed, acceleration, and road gradient for the target FC. Two kinds of ML models are used in the study: the regression type, comprising linear regression and k-nearest neighbors regression, and the classification type, comprising k-nearest neighbors classifier, logistic regression, decision tree, random forest, and gradient boosting. The prediction accuracy for real-time FC is low, in the range of 0.5 ~ 0.6, and the classification type is more accurate than the regression type. The prediction error for total FC is very low, about 0.2 ~ 2.0%, and regression models are more accurate than classification ones. This is because, with the coefficient of determination ($R^2$) as the accuracy score, the predicted values are distributed around the mean of the targets as the coefficient decreases. Therefore, regression models are suitable for total FC prediction and classification models are suitable for real-time FC prediction.

The Automated Threshold Decision Algorithm for Node Split of Phonetic Decision Tree (음소 결정트리의 노드 분할을 위한 임계치 자동 결정 알고리즘)

  • Kim, Beom-Seung;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea
    • /
    • v.31 no.3
    • /
    • pp.170-178
    • /
    • 2012
  • In this paper, a phonetic decision tree of triphone units was built for phoneme-based speech recognition of 640 stations run by Korail. The clustering rate was determined by Pearson and regression analysis in order to decide the threshold used in node splitting. Using the determined clustering rate, thresholds are decided automatically according to the average clustering rate. In recognition experiments to verify the proposed method, performance improved by 1.4~2.3% absolute over the baseline system.

Application of Multiple Linear Regression Analysis and Tree-Based Machine Learning Techniques for Cutter Life Index(CLI) Prediction (커터수명지수 예측을 위한 다중선형회귀분석과 트리 기반 머신러닝 기법 적용)

  • Ju-Pyo Hong;Tae Young Ko
    • Tunnel and Underground Space
    • /
    • v.33 no.6
    • /
    • pp.594-609
    • /
    • 2023
  • The TBM (Tunnel Boring Machine) method is gaining popularity in urban and underwater tunneling projects due to its ability to ensure excavation face stability and minimize environmental impact. Among the prominent models for predicting disc cutter life, the NTNU model uses the Cutter Life Index (CLI) as a key parameter, but the complexity of the testing procedures and the rarity of the equipment make measurement challenging. In this study, CLI was predicted from rock properties using multiple linear regression analysis and tree-based machine learning techniques. Through a literature review, a database including rock uniaxial compressive strength, Brazilian tensile strength, equivalent quartz content, and Cerchar abrasivity index was built, and derived variables were added. The multiple linear regression analysis selected input variables based on statistical significance and multicollinearity, while the machine learning prediction model chose variables based on their importance. With the data divided into 80% for training and 20% for testing, a comparative analysis of predictive performance was conducted, and XGBoost was identified as the optimal model. The validity of the multiple linear regression and XGBoost models derived in this study was confirmed by comparing their predictive performance with prior research.