• Title/Abstract/Keywords: Additive regression models

64 search results (processing time: 0.025 s)

Using Bayesian tree-based model integrated with genetic algorithm for streamflow forecasting in an urban basin

  • Nguyen, Duc Hai;Bae, Deg-Hyo
    • Proceedings of the Korea Water Resources Association Conference / 2021 Korea Water Resources Association Conference / p. 140 / 2021
  • Urban flood management is a crucial and challenging task, particularly in developed cities, so accurate prediction of urban flooding under heavy precipitation is critically important. In recent years, machine learning techniques have received considerable attention for their strong learning ability and their suitability for modeling complex, nonlinear hydrological processes. A survey of the published literature also finds that hybrid computational intelligence methods using nature-inspired algorithms are increasingly employed to predict or simulate streamflow with high reliability. This study proposes a novel approach that combines an ensemble tree model, Bayesian Additive Regression Trees (BART), with a nature-inspired algorithm to predict hourly multi-step-ahead streamflow. The resulting hybrid model, GA-BART, integrates BART with a genetic algorithm (GA). The Jungrang urban basin in Seoul, South Korea, was selected as the case study. A database was built from 39 heavy rainfall events between 2003 and 2020, collected from the rain gauges and monitoring stations in the basin. Separate models are developed for 1-hour, 2-hour, 3-hour, 4-hour, 5-hour, and 6-hour-ahead streamflow predictions. The hybrid BART model is also compared with a baseline support vector regression model. The hybrid BART model is expected to perform robustly and to be a viable option for streamflow forecasting in urban basins.
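
The multi-step-ahead setup described in this abstract can be illustrated with a minimal sketch of how one training set per lead time might be built. The function name `make_multistep_samples`, the choice of three lagged inputs, and the toy rainfall and flow numbers are all assumptions made for illustration; the abstract does not state the input features the authors used.

```python
from typing import List, Tuple


def make_multistep_samples(
    rainfall: List[float],
    flow: List[float],
    n_lags: int,
    horizon: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build (features, target) pairs for one forecast horizon.

    Features at time t: the last `n_lags` rainfall and flow values up to t.
    Target: the flow observed `horizon` steps after t.
    """
    if len(rainfall) != len(flow):
        raise ValueError("rainfall and flow must have the same length")
    X, y = [], []
    for t in range(n_lags - 1, len(flow) - horizon):
        window = slice(t - n_lags + 1, t + 1)
        X.append(rainfall[window] + flow[window])
        y.append(flow[t + horizon])
    return X, y


# Toy hourly series (made-up numbers, not data from the study).
rain = [0.0, 2.0, 5.0, 9.0, 4.0, 1.0, 0.0, 0.0, 0.0, 0.0]
q = [10.0, 11.0, 15.0, 24.0, 30.0, 27.0, 21.0, 16.0, 13.0, 11.0]

# One dataset per lead time, 1 h to 6 h ahead, matching the horizons in the abstract.
datasets = {h: make_multistep_samples(rain, q, n_lags=3, horizon=h) for h in range(1, 7)}
```

Each entry of `datasets` could then be passed to any regressor, so that a separate model is fitted per horizon. Longer horizons leave fewer usable samples, because the target must still fall inside the recorded series.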


Application effect and limitation of AHP as a research methodology -A comparison of 3 statistical technique for evaluating MIS success factor-

  • 윤재곤
    • Journal of the Korean Operations Research and Management Science Society / Vol. 21, No. 3 / pp. 109-125 / 1996
  • Biases and errors in human reasoning have been studied continuously by researchers, especially psychologists and social scientists. These biases are classified by origin, i.e., motivation and cognition. The need for research on bias has recently been growing in management and management information systems, fields whose academic backgrounds lie in psychology and social science. A biased information stream is transformed into systematic error by human motivational and cognitive bias, and the resulting phenomena are: (1) the availability of salient information, (2) preconceived ideas or theories about people and events, and (3) anchoring and perseverance phenomena. To reduce such information errors, Saaty proposed the Analytic Hierarchy Process (AHP), the subject of this paper, which is widely used to evaluate complex decision-making alternatives. This paper therefore studies the effects and limitations of AHP when applied to management. It compares the performance of three models: (1) the traditional additive regression model, (2) a regression model using factor scores, and (3) a regression model with AHP. The three models produce different outcomes.
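
For readers unfamiliar with AHP, the following minimal sketch shows the standard way priority weights and a consistency ratio are derived from a pairwise comparison matrix. It is not taken from the paper, which does not describe its computation. The function name `ahp_weights` and the 3x3 example matrix are hypothetical, and the random-index values are Saaty's commonly tabulated ones.

```python
from typing import List, Tuple

# Saaty's random consistency index for matrix sizes 3 to 5 (commonly tabulated values).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(A: List[List[float]], iters: int = 100) -> Tuple[List[float], float]:
    """Return (priority weights, consistency ratio) for a pairwise comparison matrix."""
    n = len(A)
    w = [1.0 / n] * n
    for _ in range(iters):  # power iteration toward the principal eigenvector
        v = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(Aw[i] / w[i] for i in range(n)) / n  # principal eigenvalue estimate
    ci = (lambda_max - n) / (n - 1)  # consistency index
    return w, ci / RANDOM_INDEX[n]


# Hypothetical, perfectly consistent judgments among three success factors.
comparisons = [
    [1.0, 2.0, 6.0],
    [1.0 / 2.0, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]
weights, consistency_ratio = ahp_weights(comparisons)
```

A consistency ratio below about 0.1 is the usual rule of thumb for accepting a set of judgments. The example matrix is fully consistent, so its ratio is essentially zero.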


Estimating Automobile Insurance Premiums Based on Time Series Regression

  • 김영화;박원서
    • The Korean Journal of Applied Statistics / Vol. 26, No. 2 / pp. 237-252 / 2013
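  • Forecasting models for insurance premiums and their components are essential for setting reasonable premiums. This study introduces various models used to estimate appropriate automobile property-damage insurance premiums, including a dummy-variable regression model, a model with additional independent variables, a regression model with autoregressive errors, a seasonal ARIMA model, and an intervention model. Each model was applied to actual automobile property-damage premium data to estimate premium, severity, and frequency, and the results were compared using the RMSE (root mean squared error), computed from the differences between the estimates and the actual values. The analysis of the actual data showed that the autoregressive error model performed best.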
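
The RMSE-based model comparison described in this abstract can be sketched as follows. The model labels and all numbers are made up for illustration; they are not results from the study.

```python
import math
from typing import Dict, List


def rmse(actual: List[float], predicted: List[float]) -> float:
    """Root mean squared error between observed and estimated values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical observed premiums and two hypothetical sets of model estimates.
actual = [100.0, 110.0, 120.0, 130.0]
forecasts: Dict[str, List[float]] = {
    "model_a": [102.0, 108.0, 123.0, 127.0],
    "model_b": [110.0, 100.0, 130.0, 120.0],
}

scores = {name: rmse(actual, pred) for name, pred in forecasts.items()}
best = min(scores, key=scores.get)  # the model with the smallest RMSE
```

Because RMSE squares the errors before averaging, a model with a few large misses is penalized more heavily than one with many small ones.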

A credit classification method based on generalized additive models using factor scores of mixtures of common factor analyzers

  • 임수열;백장선
    • Journal of the Korean Data and Information Science Society / Vol. 23, No. 2 / pp. 235-245 / 2012
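  • Logistic discriminant analysis is a statistical technique widely used in finance. In credit scoring it is valued for its ease of interpretation and good discriminatory power, but it is limited in explaining nonlinear relationships between the explanatory variables and the response. Generalized additive models retain the advantages of the logistic discriminant model while also capturing nonlinear relationships between the response and the explanatory variables. When the number of continuous explanatory variables is very large, however, both methods face the problem of selecting the variables that are significant for the model. This study therefore proposes a credit classification method in which a large number of continuous explanatory variables are reduced in dimension by mixtures of common factor analyzers, and the resulting small number of factor scores are used as new continuous explanatory variables in a generalized additive model. Comparing the correct classification rates of the logistic discriminant model, the generalized additive model, and the proposed method on real financial data, the proposed method showed better classification performance.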

Development of ensemble machine learning models for evaluating seismic demands of steel moment frames

  • Nguyen, Hoang D.;Kim, JunHee;Shin, Myoungsu
    • Steel and Composite Structures / Vol. 44, No. 1 / pp. 49-63 / 2022
  • This study aims to develop ensemble machine learning (ML) models for estimating the peak floor acceleration and maximum top drift of steel moment frames. For this purpose, random forest, adaptive boosting, gradient boosting regression tree (GBRT), and extreme gradient boosting (XGBoost) models were considered. A total of 621 steel moment frames were analyzed under 240 ground motions using OpenSees software to generate the dataset for the ML models. The GBRT and XGBoost models exhibited the highest performance for predicting peak floor acceleration and maximum top drift, respectively. The significance of each input variable for the prediction was examined using the best-performing models and the Shapley additive explanations (SHAP) approach. The peak ground acceleration had the most significant impact on the peak floor acceleration prediction, while the spectral accelerations at 1 and 2 s had the greatest influence on the maximum top drift prediction. Finally, a graphical user interface module was created as a first step toward applying ML to estimate the seismic demands of building structures in practical design.
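
The idea behind Shapley-based attribution can be shown with a brute-force sketch that computes exact Shapley values for a tiny model. This is an illustration only. It is not the SHAP library's algorithm, which relies on faster model-specific or sampling-based estimators, and it is not the authors' code. The function `shapley_values`, the single-baseline convention, and the toy model are assumptions.

```python
from itertools import permutations
from typing import Callable, List, Sequence


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values of model(x) relative to model(baseline).

    Features absent from a coalition are set to their baseline value.
    Cost grows as n!, so this is only practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = model(current)
        for i in order:
            current[i] = x[i]  # add feature i to the coalition
            value = model(current)
            phi[i] += value - prev  # marginal contribution of feature i
            prev = value
    return [p / len(orders) for p in phi]


def toy_model(v: Sequence[float]) -> float:
    # Hypothetical model with one interaction term.
    return 2.0 * v[0] + 3.0 * v[1] + v[0] * v[1]


phi = shapley_values(toy_model, x=[1.0, 2.0], baseline=[0.0, 0.0])
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features involved.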

Random Regression Models Using Legendre Polynomials to Estimate Genetic Parameters for Test-day Milk Protein Yields in Iranian Holstein Dairy Cattle

  • Naserkheil, Masoumeh;Miraie-Ashtiani, Seyed Reza;Nejati-Javaremi, Ardeshir;Son, Jihyun;Lee, Deukhwan
    • Asian-Australasian Journal of Animal Sciences / Vol. 29, No. 12 / pp. 1682-1687 / 2016
  • The objective of this study was to estimate the genetic parameters of milk protein yields in Iranian Holstein dairy cattle. A total of 1,112,082 test-day milk protein yield records of 167,269 first lactation Holstein cows, calved from 1990 to 2010, were analyzed. Estimates of the variance components, heritability, and genetic correlations for milk protein yields were obtained using a random regression test-day model. Milking times, herd, age of recording, year, and month of recording were included as fixed effects in the model. Additive genetic and permanent environmental random effects for the lactation curve were taken into account by applying orthogonal Legendre polynomials of the fourth order in the model. The lowest and highest additive genetic variances were estimated at the beginning and end of lactation, respectively. Permanent environmental variance was higher at both extremes. Residual variance was lowest at the middle of the lactation and contrarily, heritability increased during this period. Maximum heritability was found during the 12th lactation stage ($0.213{\pm}0.007$). Genetic, permanent, and phenotypic correlations among test-days decreased as the interval between consecutive test-days increased. A relatively large data set was used in this study; therefore, the estimated (co)variance components for random regression coefficients could be used for national genetic evaluation of dairy cattle in Iran.
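
The Legendre-polynomial covariates used in random regression test-day models can be sketched as below. The sketch assumes that "fourth order" means polynomials up to degree 4 and that days in milk (DIM) range from 5 to 305. Both are assumptions, since the abstract states neither, and the function name `legendre_covariates` is hypothetical.

```python
import math
from typing import List


def legendre_covariates(dim: float, dim_min: float, dim_max: float, order: int) -> List[float]:
    """Normalized Legendre polynomials phi_0..phi_order at a standardized day in milk."""
    x = -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)  # map DIM to [-1, 1]
    p = [1.0, x]
    for n in range(1, order):
        # Bonnet recurrence: (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    # Scale each polynomial so it has unit norm on [-1, 1].
    return [math.sqrt((2 * k + 1) / 2.0) * p[k] for k in range(order + 1)]


# Covariates at mid-lactation and at the end of lactation (assumed DIM range 5 to 305).
mid = legendre_covariates(155.0, 5.0, 305.0, order=4)
end = legendre_covariates(305.0, 5.0, 305.0, order=4)
```

In such a model, each animal's additive genetic and permanent environmental deviations are expressed as a linear combination of these covariates, so the variance at any DIM follows from the covariance matrix of the regression coefficients.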

Semiparametric and Nonparametric Modeling for Matched Studies

  • Kim, In-Young;Cohen, Noah
    • Proceedings of the Korean Statistical Society Conference / Proceedings of the Korean Statistical Society 2003 Autumn Conference / pp. 179-182 / 2003
  • This study describes a new graphical method for assessing and characterizing effect modification by a matching covariate in matched case-control studies. This method to understand effect modification is based on a semiparametric model using a varying coefficient model. The method allows for nonparametric relationships between effect modification and other covariates, or can be useful in suggesting parametric models. This method can be applied to examining effect modification by any ordered categorical or continuous covariates for which cases have been matched with controls. The method applies to effect modification when causality might be reasonably assumed. An example from veterinary medicine is used to demonstrate our approach. The simulation results show that this method, when based on linear, quadratic and nonparametric effect modification, can be more powerful than both a parametric multiplicative model fit and a fully nonparametric generalized additive model fit.


Application of Explainable Artificial Intelligence for Predicting Hardness of AlSi10Mg Alloy Manufactured by Laser Powder Bed Fusion

  • 전준협;서남혁;김민수;손승배;정재길;이석재
    • Journal of Powder Materials / Vol. 30, No. 3 / pp. 210-216 / 2023
  • In this study, machine learning models are proposed to predict the Vickers hardness of AlSi10Mg alloys fabricated by laser powder bed fusion (LPBF). A total of 113 utilizable datasets were collected from the literature. The hyperparameters of the machine-learning models were adjusted to select an accurate predictive model. The random forest regression (RFR) model showed the best performance compared to support vector regression, artificial neural networks, and k-nearest neighbors. The variable importance and prediction mechanisms of the RFR were discussed by Shapley additive explanation (SHAP). Aging time had the greatest influence on the Vickers hardness, followed by solution time, solution temperature, layer thickness, scan speed, power, aging temperature, average particle size, and hatching distance. Detailed prediction mechanisms for RFR are analyzed using SHAP dependence plots.

JAYA-GBRT model for predicting the shear strength of RC slender beams without stirrups

  • Tran, Viet-Linh;Kim, Jin-Kook
    • Steel and Composite Structures / Vol. 44, No. 5 / pp. 691-705 / 2022
  • Shear failure in reinforced concrete (RC) structures is very hazardous. This failure is rarely predicted and may occur without any prior signs. Accurate shear strength prediction of RC members is challenging, and traditional methods have difficulty solving it. This study develops a JAYA-GBRT model based on the JAYA algorithm and the gradient boosting regression tree (GBRT) to predict the shear strength of RC slender beams without stirrups. First, 484 tests are carefully collected and divided into training and test sets. Then, the hyperparameters of the GBRT model are determined using the JAYA algorithm and 10-fold cross-validation. The performance of the JAYA-GBRT model is compared with five well-known empirical models. The comparative results show that the JAYA-GBRT model ($R^2$ = 0.982, RMSE = 9.466 kN, MAE = 6.299 kN, µ = 1.018, and Cov = 0.116) outperforms the other models. Moreover, the predictions of the JAYA-GBRT model are globally and locally explained using the Shapley Additive exPlanation (SHAP) method. The SHAP method identifies the effective depth as the most crucial parameter influencing the shear strength. Finally, a graphical user interface (GUI) tool and a web application (WA) are developed to apply the JAYA-GBRT model for rapidly predicting the shear strength of RC slender beams without stirrups.
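
The JAYA search used for hyperparameter tuning can be illustrated with a minimal sketch of its update rule. The objective below is a toy quadratic standing in for a cross-validated error. The names `jaya_minimize` and `toy_cv_error`, the two tuned parameters, their bounds, and the population settings are all hypothetical and not taken from the paper.

```python
import random
from typing import Callable, List, Tuple


def jaya_minimize(
    objective: Callable[[List[float]], float],
    bounds: List[Tuple[float, float]],
    pop_size: int = 20,
    iterations: int = 100,
    seed: int = 0,
) -> Tuple[List[float], float, float]:
    """Return (best solution, best score, best score of the initial population)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    initial_best = min(scores)
    for _ in range(iterations):
        best = pop[scores.index(min(scores))]
        worst = pop[scores.index(max(scores))]
        for i, x in enumerate(pop):
            cand = []
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # JAYA move: toward the best solution, away from the worst.
                v = x[j] + r1 * (best[j] - abs(x[j])) - r2 * (worst[j] - abs(x[j]))
                cand.append(min(max(v, lo), hi))  # clip to the search bounds
            s = objective(cand)
            if s < scores[i]:  # greedy acceptance: keep the candidate only if it improves
                pop[i], scores[i] = cand, s
    k = scores.index(min(scores))
    return pop[k], scores[k], initial_best


def toy_cv_error(params: List[float]) -> float:
    # Stand-in for a cross-validated error over (learning rate, number of trees).
    learning_rate, n_trees = params
    return (learning_rate - 0.1) ** 2 + ((n_trees - 300.0) / 1000.0) ** 2


bounds = [(0.01, 0.5), (50.0, 1000.0)]
best_params, best_score, initial_best = jaya_minimize(toy_cv_error, bounds)
```

In a real tuning run, `toy_cv_error` would be replaced by a function that trains the regressor with the candidate hyperparameters and returns its k-fold validation error. JAYA needs no algorithm-specific control parameters beyond population size and iteration count.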

Application of random regression models for genetic analysis of 305-d milk yield over different lactations of Iranian Holsteins

  • Torshizi, Mahdi Elahi;Farhangfar, Homayoun;Mashhadi, Mojtaba Hosseinpour
    • Asian-Australasian Journal of Animal Sciences / Vol. 30, No. 10 / pp. 1382-1387 / 2017
  • Objective: During the last decade, genetic evaluation of dairy cows from longitudinal data (test-day milk yield or 305-day milk yield) using the random regression method has been officially adopted in several countries. The objectives of this study were to estimate covariance functions for genetic and permanent environmental effects and to obtain genetic parameters of 305-day milk yield over seven parities. Methods: The data comprised 60,279 305-day milk yield records from 17,309 Iranian Holstein dairy cows in 7 parities, calved between 20 and 140 months between 2004 and 2011. Residual variances were modeled by homogeneous and step functions with 7 and 10 classes. Results: A third-order polynomial for additive genetic and permanent environmental effects plus a step function with 10 classes for the residual variance was the most adequate and parsimonious model for describing the covariance structure of the data. Heritability estimates obtained with this model varied from 0.17 to 0.28. This model performed better than the repeatability model. Moreover, 10 classes of residual variance produced more accurate results than 7 classes or a homogeneous residual effect. Conclusion: A quadratic Legendre polynomial for additive genetic and permanent environmental effects with a 10-class step function for the residual variance is sufficient to produce a parsimonious model that explains the change in 305-day milk yield over consecutive parities of Iranian Holstein cows.