• Title/Abstract/Keywords: Regression trees

Search results: 244 items (processing time: 0.026 s)

도시림의 여름 대기온도 저감효과 - 서울시를 대상으로 - (The Effects of Urban Forest on Summer Air Temperature in Seoul, Korea)

  • 조용현;신수영
    • 한국조경학회지 / Vol. 30, No. 4 / pp.28-36 / 2002
  • The main purpose of this study was to estimate a new regression model explaining the relationship between urban forest and air temperature in summer 2001. The study consists of two parts: correlation coefficient analysis and regression analysis. According to the correlation coefficient analysis, thermal infrared radiation differed significantly among the major land use categories. However, there was no significant relationship between the data derived from the Landsat-7 ETM+ image (thermal infrared radiation and NDVI) and air temperature at Automatic Weather Stations (AWSs). After various regression models for summer air temperature were estimated, the final models were chosen. The final regression models consisted of two variables, forest area and traffic facilities area, and explained over 78% of the variability in air temperature. In these models, the coefficient of the first variable was even more significant than that of the second. However, the negative impact of the traffic facilities area was slightly greater than the positive impact of the forest area. Consequently, the effects of forest area and traffic facilities area were apparent in explaining summer air temperature in Seoul. Two policies therefore have the most important implications for mitigating summer air temperature in Seoul: expanding and conserving the urban forest, and changing the characteristics of traffic facilities. The results are expected to be useful not merely in informing the public that urban forest mitigates summer air temperature, but also in making the case for budgets for trees and urban forest management. A field survey of summer air temperature is recommended to validate the models.

도시 낙엽성 조경수종의 탄소저장 및 흡수 (Carbon Storage and Uptake by Deciduous Tree Species for Urban Landscape)

  • 조현길;안태원
    • 한국조경학회지 / Vol. 40, No. 5 / pp.160-168 / 2012
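  • This study used direct harvesting to develop regression models that readily estimate carbon storage and uptake by deciduous urban landscape trees, and established baseline information needed to quantify carbon reduction by urban greenspace. The study species were Acer palmatum, Zelkova serrata, Prunus yedoensis, and Ginkgo biloba, which are commonly planted as urban landscape trees. For each species, trees were purchased to cover diameter at breast height (DBH) sizes at regular intervals from young to mature. Biomass per tree, by part and in total, was measured by direct harvesting that included root excavation, and carbon storage was computed from it. Stem disks were also sampled at breast height to analyze diameter growth and compute carbon uptake. Easy-to-use regression models were derived, with DBH as the independent variable, to quantify carbon storage and uptake by individual trees of each of the four species as they grow. The $r^2$ of these regression equations ranged from 0.94 to 0.99, indicating a high goodness of fit. Both carbon storage and carbon uptake per tree increased with diameter growth, and the differences between diameter classes also generally tended to increase with larger diameters. At the same diameter, Zelkova serrata tended to be highest, followed by Prunus yedoensis and Ginkgo biloba. Applying the derived equations, a single Zelkova serrata tree with a DBH of 15 cm stores about 54 kg of carbon and takes up 7 kg of carbon per year. Because directly felling urban landscape trees and excavating their roots is difficult, earlier studies substituted coefficients from forest trees, such as biomass expansion factors, below-ground/above-ground ratios, and diameter growth; this study lays new groundwork for overcoming that limitation. The results can be usefully applied as a public base technology for evaluating carbon reduction by landscape trees in connection with the urban greenspace projects of national and local governments.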

Study on the ensemble methods with kernel ridge regression

  • Kim, Sun-Hwa;Cho, Dae-Hyeon;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / Vol. 23, No. 2 / pp.375-383 / 2012
  • The purpose of ensemble methods is to increase prediction accuracy by combining many classifiers. Recent studies have shown that random forests and forward stagewise regression achieve good accuracy in classification problems. However, they have large prediction errors near separation boundaries because they use decision trees as base learners. In this study, we use kernel ridge regression instead of decision trees in random forests and boosting. The usefulness of the proposed ensemble methods is shown by simulation results on the prostate cancer and Boston housing data.

회귀 모델을 활용한 철강 기업의 에너지 소비 예측 (Forecasting Energy Consumption of Steel Industry Using Regression Model)

  • Sung-Ho KANG;Hyun-Ki KIM
    • Journal of Korea Artificial Intelligence Association / Vol. 1, No. 2 / pp.21-25 / 2023
  • The purpose of this study was to compare the performance of multiple regression models in predicting the energy consumption of the steel industry. Independent variables were selected in consideration of the correlations among attributes such as CO2 concentration, NSM, Week Status, Day of week, and Load Type, and preprocessing was performed to address multicollinearity. During preprocessing, we evaluated linear and nonlinear relationships between attributes through correlation analysis. In particular, we selected variables with high correlation and included appropriate variables in the final model to prevent multicollinearity problems. Among the regression models trained, Boosted Decision Tree Regression showed the best predictive performance. By combining multiple decision trees, the ensemble learning in this model effectively learned complex patterns while preventing overfitting. Consequently, these predictive models are expected to provide important information for improving energy efficiency and management decision-making in the steel industry. In the future, we plan to improve model performance by collecting more data and extending the variables, and to consider applying the model with interactions with external factors taken into account.
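
The idea of boosted decision tree regression mentioned in this abstract can be illustrated with a minimal sketch. It assumes squared-error loss and depth-1 trees (stumps) on a single feature; the data, the number of rounds, and the learning rate are all made-up illustrations, not the study's data or settings.

```python
# Minimal gradient boosting for squared error with regression stumps.
# All data and parameter values below are illustrative assumptions.

def fit_stump(xs, ys):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Fit each new stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)          # start from the mean prediction
    stumps = []
    preds = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate

def predict(model, x):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, x) for s in stumps)

def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical one-feature training set.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.2, 1.9, 3.2, 3.8, 8.1, 8.9, 9.7, 11.2]

for rounds in (0, 1, 20):
    print(rounds, mse(boost(xs, ys, rounds=rounds), xs, ys))
```

The training error shrinks as rounds are added; whether that carries over to unseen data depends on the learning rate, the number of rounds, and a held-out evaluation, none of which this sketch covers.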

마할라노비스-다구치 시스템과 로지스틱 회귀의 성능비교 : 사례연구 (Performance Comparison of Mahalanobis-Taguchi System and Logistic Regression : A Case Study)

  • 이승훈;임근
    • 대한산업공학회지 / Vol. 39, No. 5 / pp.393-402 / 2013
  • The Mahalanobis-Taguchi System (MTS) is a diagnostic and predictive method for multivariate data. In the MTS, the Mahalanobis space (MS) of the reference group is obtained using the standardized variables of normal data. The Mahalanobis space can be used for multi-class classification. Once this MS is established, a useful set of variables is identified using orthogonal arrays and signal-to-noise ratios to assist in model analysis or diagnosis. Several other techniques have already been used for classification, such as linear discriminant analysis, logistic regression, decision trees, and neural networks. The goal of this case study is to compare the ability of the Mahalanobis-Taguchi System and logistic regression using a data set.
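
As a rough illustration of how a Mahalanobis space is built from standardized reference data, the sketch below computes the scaled Mahalanobis distance for two variables. The reference points and the test point are hypothetical, and population (divide-by-n) statistics are an assumption made here so that the reference group averages exactly 1; the orthogonal-array and signal-to-noise steps of the MTS are not covered.

```python
import statistics

def mahalanobis_space(reference):
    """Summarise a two-variable reference group: means, std devs, correlation."""
    xs = [p[0] for p in reference]
    ys = [p[1] for p in reference]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    n = len(reference)
    r = sum((x - mx) * (y - my) for x, y in reference) / (n * sx * sy)
    return (mx, my), (sx, sy), r

def scaled_md(point, space):
    """Scaled Mahalanobis distance z^T C^{-1} z / k, with k = 2 variables."""
    (mx, my), (sx, sy), r = space
    z1 = (point[0] - mx) / sx
    z2 = (point[1] - my) / sy
    # Closed-form inverse of the 2x2 correlation matrix C = [[1, r], [r, 1]].
    quad = (z1 * z1 - 2.0 * r * z1 * z2 + z2 * z2) / (1.0 - r * r)
    return quad / 2.0

# Hypothetical "normal" observations forming the reference group.
reference = [(5.0, 10.1), (6.0, 11.8), (5.5, 11.2),
             (4.8, 9.9), (6.2, 12.5), (5.1, 10.9)]
space = mahalanobis_space(reference)

reference_mds = [scaled_md(p, space) for p in reference]
print(sum(reference_mds) / len(reference_mds))  # average over the reference group
print(scaled_md((6.0, 9.0), space))             # a point that breaks the correlation
```

A point that sits within each variable's range but violates the correlation pattern of the reference group still gets a large distance, which is what distinguishes this measure from per-variable thresholds.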

Variable Selection with Regression Trees

  • Chang, Young-Jae
    • 응용통계연구 / Vol. 23, No. 2 / pp.357-366 / 2010
  • Many tree algorithms have been developed for regression problems. Although they are regarded as good algorithms, most of them lose prediction accuracy when there are many noise variables. To handle this problem, we propose the multi-step GUIDE, a regression tree algorithm with a variable selection process. The multi-step GUIDE performs better than some well-known algorithms such as Random Forest and MARS. Results from a simulation study show that the multi-step GUIDE outperforms other algorithms in terms of variable selection and prediction accuracy. It generally selects the important variables correctly, along with relatively few noise variables, and eventually gives good prediction accuracy.
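
For readers unfamiliar with how a regression tree chooses a split, here is a minimal sketch of the exhaustive, CART-style search that minimizes the children's sum of squared errors. This is not the GUIDE or multi-step GUIDE procedure from the abstract; the data are hypothetical, with feature 0 informative and feature 1 acting as a noise variable.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, targets):
    """Search every (feature, threshold) pair; return (score, feature, threshold)."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

# Hypothetical data: the target jumps when feature 0 exceeds 3.5;
# feature 1 is unrelated noise.
rows = [(1.0, 0.3), (2.0, 0.9), (3.0, 0.1), (4.0, 0.7), (5.0, 0.5), (6.0, 0.2)]
targets = [1.0, 1.1, 0.9, 5.0, 5.2, 4.9]

score, feature, threshold = best_split(rows, targets)
print(feature, threshold, score)
```

With many noise variables and few observations, an exhaustive search like this one has more chances to pick a spurious split, which is the kind of accuracy loss the abstract describes.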

A comparative assessment of bagging ensemble models for modeling concrete slump flow

  • Aydogmus, Hacer Yumurtaci;Erdal, Halil Ibrahim;Karakurt, Onur;Namli, Ersin;Turkan, Yusuf S.;Erdal, Hamit
    • Computers and Concrete / Vol. 16, No. 5 / pp.741-757 / 2015
  • In the last decade, several modeling approaches have been proposed and applied to estimate the high-performance concrete (HPC) slump flow. Because HPC is a highly complex material, modeling its behavior is very difficult, so selecting and applying proper modeling methods remains a crucial task. Like many other applications, HPC slump flow prediction suffers from noise, which negatively affects the prediction accuracy and increases the variance. In recent years, ensemble learning methods have been introduced to improve prediction accuracy and reduce prediction error. This study investigates the potential usage of bagging (Bag), which is among the most popular ensemble learning methods, in building ensemble models. Four well-known artificial intelligence models (i.e., classification and regression trees CART, support vector machines SVM, multilayer perceptron MLP and radial basis function neural networks RBF) are deployed as base learners. As a result of this study, bagging ensemble models (i.e., Bag-SVM, Bag-RT, Bag-MLP and Bag-RBF) are found superior to their base learners (i.e., SVM, CART, MLP and RBF), and bagging could noticeably improve the prediction accuracy and reduce the prediction error of the proposed predictive models.
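
Bagging as described here can be sketched in a few lines: fit one base learner per bootstrap resample and average the predictions. The sketch below assumes a one-feature regression stump as the base learner (a stand-in for CART, SVM, MLP, or RBF), and the data, ensemble size, and seed are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Depth-1 regression tree: (threshold, left_mean, right_mean)."""
    order = sorted(set(xs))
    if len(order) < 2:                      # degenerate resample: constant model
        mean = sum(ys) / len(ys)
        return (order[0], mean, mean)
    best = None
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def bagged_fit(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagged_predict(models, x):
    """Average the base learners' predictions."""
    return sum(predict_stump(m, x) for m in models) / len(models)

# Hypothetical noisy measurements.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 1.8, 2.4, 2.2, 6.9, 7.4, 7.1, 7.6]

models = bagged_fit(xs, ys)
print(bagged_predict(models, 4.5))
```

Because each stump sees a different resample, the split thresholds vary, and averaging smooths the single stump's hard step, which is how bagging lowers variance for unstable base learners.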

Studies on Biomass for Young Abies koreana Wilson

  • Lee, Do-Hyung;Yoon, Jun-Hyuck;Woo, Kwan-Soo
    • 한국산림과학회지 / Vol. 96, No. 2 / pp.138-144 / 2007
  • This study was undertaken to compare the biomass of Abies koreana growing at two sites. A $10 \times 10$ m plot was established at each site: a natural stand on Mt. Jiri and a plantation in Gyeongsan nursery. Five trees of A. koreana were randomly selected at each site. The following traits were measured for each tree: height, basal diameter, age, and weight of stem, branches, and needles as above-ground traits, and weight of total roots, horizontal roots, and vertical roots as below-ground traits. In Gyeongsan nursery, the age of sample trees was negatively correlated with both height and weight of total stem, while height was highly correlated with weight of horizontal roots. There was high correlation between the basal diameter and weight of total stem, and between the basal diameter and weight of roots. In the Mt. Jiri stand, most of the above-ground traits except age were significantly correlated with the below-ground traits. The linear regression equation between the cross section area of base (X) and the weight of total stem (Y) in Gyeongsan nursery was Y=12.66X-12.92, and the correlation was significant ($R^2=0.89$). The linear regression equation between the cross section area of base (X) and the weight of total branches (Y) in the Mt. Jiri stand was Y=25.51X+6.00, and the correlation was highly significant ($R^2=1.0$).
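
The reported equation Y=12.66X-12.92 can be applied directly, and the least-squares fit behind such an equation can be sketched as below. The X values are hypothetical and the Y values are generated from the reported equation purely to show that the fit recovers it; they are not the study's measurements, and the abstract does not state the units.

```python
def stem_weight(base_area):
    """Reported Gyeongsan nursery equation: Y = 12.66 X - 12.92 (units not stated)."""
    return 12.66 * base_area - 12.92

def ols_fit(xs, ys):
    """Simple linear regression by least squares; returns slope, intercept, R^2."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical cross-section areas; Y values come from the equation itself,
# so the fit is exact by construction.
areas = [1.5, 2.0, 2.5, 3.0, 3.5]
weights = [stem_weight(a) for a in areas]

slope, intercept, r2 = ols_fit(areas, weights)
print(slope, intercept, r2)
```

With real measurements the points scatter around the line, and $R^2$ drops below 1 in proportion to the unexplained variation.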

TREE FORM CLASSIFICATION OF OWNER PAYMENT BEHAVIOUR

  • Hanh Tran;David G. Carmichael;Maria C. A. Balatbat
    • 국제학술발표논문집 / The 4th International Conference on Construction Engineering and Project Management Organized by the University of New South Wales / pp.526-533 / 2011
  • Contracting is said to be a high-risk business, and a common cause of business failure is related to cash management. A contractor's financial viability depends heavily on how actual payments from an owner deviate from those defined in the contract. The paper presents a method for contractors to evaluate the punctuality and fullness of owner payments based on historical behaviour. It does this by classifying owners according to their late and incomplete payment practices. A payment profile of an owner, in the form of aging claims submitted by the contractor, is used as a basis for the method's development. Regression trees are constructed based on three predictor variables, namely, the average time to payment following a claim, the total amount ending up being paid within a certain period and the level of variability in claim response times. The Tree package in the publicly available R program is used for building the trees. The analysis is particularly useful for contractors at the pre-tendering stage, when contractors predict the likely payment scenario in an upcoming project. Based on the method, the contractor can decide whether to tender or not tender, or adjust its financial preparations accordingly. The paper is a contribution in risk management applied to claim and dispute resolution practice. It is argued that by contractors having a better understanding of owner payment behaviour, fewer disputes and contractor business failures will occur.

XGBoost와 SHAP 기법을 활용한 근로자 이직 예측에 관한 연구 (A Study on the Employee Turnover Prediction using XGBoost and SHAP)

  • 이재준;이유린;임도현;안현철
    • 한국정보시스템학회지:정보시스템연구 / Vol. 30, No. 4 / pp.21-42 / 2021
  • Purpose: For companies to continue to grow, they should properly manage human resources, which are the core of corporate competitiveness. Employee turnover means the loss of talent in the workforce. When an employee voluntarily leaves, the company loses its hiring and training costs, and the departure can lead to the withdrawal of key personnel and new costs to train a replacement. From an employee's viewpoint, moving to another company is also risky because it can be time consuming and costly. Therefore, to reduce the social and economic costs caused by employee turnover, it is necessary to accurately predict employee turnover intention, identify the factors affecting employee turnover, and manage them appropriately in the company. Design/methodology/approach: Prior studies have mainly used logistic regression and decision trees, which have explanatory power but poor predictive accuracy. To develop a more accurate prediction model, XGBoost is proposed as the classification technique. Then, to compensate for the lack of explainability, SHAP, one of the XAI techniques, is applied. As a result, the prediction accuracy of the proposed model is improved compared to conventional methods such as LOGIT and Decision Trees. By applying SHAP to the proposed model, the factors affecting overall employee turnover intention as well as a specific sample's turnover intention are identified. Findings: Experimental results show that the prediction accuracy of XGBoost is superior to that of logistic regression and decision trees. Using SHAP, we find that jobseeking, annuity, eng_test, comm_temp, seti_dev, seti_money, equl_ablt, and sati_safe significantly affect overall employee turnover intention. In addition, it is confirmed that the factors affecting an individual's turnover intention are more diverse. Our research findings imply that companies should adopt a personalized approach for each employee in order to effectively prevent his or her turnover.
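
SHAP attributes a single prediction to its input features using Shapley values. The sketch below computes exact Shapley values by enumerating feature subsets, with absent features replaced by a baseline value. It is an illustration of the underlying idea only: `toy_model` is a hypothetical stand-in for a trained XGBoost classifier, the baseline replacement is a simplifying assumption, and this is not the algorithm the SHAP library uses for tree models.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value; the rest take the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for s in combinations(others, size):
                s = set(s)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    """Hypothetical score: one additive term plus one interaction term."""
    return 2.0 * z[0] + z[1] * z[2]

x = [1.0, 2.0, 3.0]          # the sample being explained (made up)
baseline = [0.0, 0.0, 0.0]   # reference input (an assumption)

phi = shapley_values(toy_model, x, baseline)
print(phi)
print(sum(phi), toy_model(x) - toy_model(baseline))
```

The attributions always sum to the difference between the sample's prediction and the baseline prediction, which is what lets per-sample explanations and overall factor rankings be read from the same quantities. Exact enumeration grows exponentially with the number of features, so it is practical only for small illustrations like this one.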