• Title/Summary/Keyword: multiple linear regression model (다중선형 회귀모형)

Effects of Multicollinearity in Logit Model (로짓모형에 있어서 다중공선성의 영향에 관한 연구)

  • Ryu, Si-Kyun
    • Journal of Korean Society of Transportation / v.26 no.1 / pp.113-126 / 2008
  • This research explores the effects of multicollinearity on the reliability and goodness of fit of the logit model. To investigate these effects in the multinomial logit model, numerical experiments were performed. Explanatory variables (attributes of the utility functions) were generated with correlations ranging from $\rho = 0.0$ to $\rho = 0.9$, and the rho-squared values and t-statistics, which serve as indices of the goodness of fit and reliability of the logit model, were tracked. The experiments support the following findings: 1) When a new explanatory variable is added, some rho-squared values increase while others decrease. 2) Higher correlation between generic variables worsens the goodness of fit of the logit model. 3) Multicollinearity tends to produce overestimated parameters. 4) The reliability of the estimated parameters tends to decrease when the correlations between attributes are high. These results suggest that, when developing a logit model, one should check for multicollinearity and apply appropriate treatments to reduce it.
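
The loss of reliability described in findings 3 and 4 can be illustrated in the simpler linear case. The sketch below is not the paper's experiment: it assumes a linear regression with exactly two predictors whose correlation is rho, for which the variance inflation factor reduces to 1 / (1 - rho^2). The function name `vif_two_predictors` is hypothetical, and the logit case is only qualitatively analogous.

```python
def vif_two_predictors(rho: float) -> float:
    """Variance inflation factor for a linear model with two predictors.

    Assumption: exactly two predictors with Pearson correlation `rho`;
    in that case VIF = 1 / (1 - rho**2) for both slope coefficients.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return 1.0 / (1.0 - rho ** 2)


if __name__ == "__main__":
    # Sweep the same correlation range as the abstract (0.0 to 0.9).
    for step in range(10):
        rho = step / 10
        vif = vif_two_predictors(rho)
        # All else equal, standard errors grow with sqrt(VIF),
        # so t-statistics shrink by the same factor.
        print(f"rho={rho:.1f}  VIF={vif:.2f}  SE inflation={vif ** 0.5:.2f}")
```

Under these assumptions the inflation is mild at low correlations and grows quickly as rho approaches 0.9.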

Development of Naïve-Bayes classification and multiple linear regression model to predict agricultural reservoir storage rate based on weather forecast data (기상예보자료 기반의 농업용저수지 저수율 전망을 위한 나이브 베이즈 분류 및 다중선형 회귀모형 개발)

  • Kim, Jin Uk;Jung, Chung Gil;Lee, Ji Wan;Kim, Seong Joon
    • Journal of Korea Water Resources Association / v.51 no.10 / pp.839-852 / 2018
  • The purpose of this study is to predict monthly agricultural reservoir storage by developing a weather-data-based multiple linear regression model (MLRM) using precipitation, maximum temperature, minimum temperature, average temperature, and average wind speed. Using Naïve-Bayes classification, a total of 1,559 reservoirs nationwide were classified into 30 clusters based on geomorphological specifications (effective storage volume, irrigation area, watershed area, latitude, longitude, and drought frequency). For each cluster, a monthly MLRM was derived from 13 years (2002-2014) of meteorological data from the KMA (Korea Meteorological Administration) and reservoir storage rate data from the KRC (Korea Rural Community). The MLRM for reservoir storage rate achieved a coefficient of determination ($R^2$) of 0.76, a Nash-Sutcliffe efficiency (NSE) of 0.73, and a root mean square error (RMSE) of 8.33%. The MLRM was then evaluated for 2 years (2015-2016) using 3-month weather forecast data from GloSea5 (GS5) by the KMA. For the Reservoir Drought Index (RDI), which is defined from the present and normal-year reservoir storage rates, the average ROC (Receiver Operating Characteristic) hit rate was 0.80 with observed data and 0.73 with GS5 data in the MLRM. The results of this study can be used to predict future reservoir storage rates and to support decisions on stable agricultural water supply.
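
The three skill scores quoted above have standard definitions, sketched below in plain Python. The abstract does not say how $R^2$ was computed; the squared Pearson correlation used here is an assumption, as are the function names and the toy numbers.

```python
import math
from typing import Sequence


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 minus the error sum of squares
    divided by the variability of the observations around their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r_squared(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Coefficient of determination, taken here (an assumption) as the
    squared Pearson correlation between observations and simulations."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]   # toy values, not study data
    simulated = [2.0, 3.0, 4.0, 5.0]  # constant bias of +1
    print(rmse(observed, simulated), nse(observed, simulated),
          r_squared(observed, simulated))
```

The toy case shows why the scores are reported together: a constant bias leaves the correlation-based $R^2$ at its maximum while lowering NSE and raising RMSE.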

Bias Correction of AMSR2 Soil Moisture Data Using a Multiple Regression Method (다중회귀모형을 이용한 AMSR2 토양수분의 정량적 개선)

  • Kim, Myojeong;Kim, Gwangseob
    • Proceedings of the Korea Water Resources Association Conference / 2015.05a / pp.514-514 / 2015
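  • Accurate spatial soil moisture information is essential for improving flood prediction. Soil moisture is now observed by satellite, but the retrievals differ substantially in magnitude from actual soil moisture conditions, so a quantitative improvement through bias correction is required. In this study, we therefore improved soil moisture data quantitatively using satellite-observed AMSR2 soil moisture, ground-observed soil moisture data, and a multiple regression model. We used soil moisture data for all of Korea obtained by downscaling AMSR2 soil moisture from a spatial resolution of 10 km to 1 km, rainfall data observed at 556 rain gauge stations provided by the Water Resources Management Information System (WAMIS), post-processed MODIS LST data, evapotranspiration, and vegetation indices. Regression coefficients were estimated for each soil group using data from the six usable soil moisture stations among those operated by the Korea Meteorological Administration's agricultural meteorological observatories, for the period from July 2012 through 2013. The soil moisture data bias-corrected with the soil-group-specific multiple regression models remedied the general underestimation of AMSR2 soil moisture, improving the usability of satellite-observed soil moisture data (Fig. 1).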

Development of Accident Forecasting Models in Freeway Tunnels using Multiple Linear Regression Analysis (다중선형 회귀분석을 이용한 고속도로 터널구간의 교통사고 예측모형 개발)

  • Park, Ju-Hwan;Kim, Sang-Gu
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.11 no.6 / pp.145-154 / 2012
  • This paper analyzes the characteristics of traffic accidents in all tunnels on freeways nationwide and selects independent variables related to accident occurrence in tunnels. The study aims to develop reliable accident forecasting models using several candidate dependent variables: the number of accidents (no.), no./km, and no./MVK. The developed multiple linear regression models were validated using statistics such as $R^2$, F values, multicollinearity, and residual analysis. Accident forecasting models were selected in light of the characteristics of tunnel accidents, and two models were finally proposed, one for each of two tunnel-length groups. In the selected models, the natural logarithm ln(no./MVK) is the dependent variable, and AADT, vertical slope, and tunnel height are the independent variables. The reliability of the two models was confirmed by comparing field data with estimated data using RMSE and MAE. These models may be effective in evaluating tunnel safety in the design and planning phases, and useful for reducing traffic accidents and managing traffic flow in tunnels.
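
The selected model form and the two error measures can be written out explicitly. The abstract names the variables but reports no coefficients, so $\beta_0,\dots,\beta_3$ and $\varepsilon$ below are placeholder symbols rather than published values; $y_i$, $\hat{y}_i$, and $n$ (observed value, estimated value, and number of observations) are likewise assumed notation, and RMSE and MAE are given in their usual definitions.

```latex
\ln(\text{no./MVK}) = \beta_0 + \beta_1\,\text{AADT}
  + \beta_2\,(\text{vertical slope}) + \beta_3\,(\text{tunnel height}) + \varepsilon

\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
\text{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|
```

Because the dependent variable is a logarithm, a fitted value must be exponentiated to recover an accident rate in no./MVK.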

Characteristics and Models of the Side-swipe Accident in the Case of Cheongju 4-legged Signalized Intersections (4지 신호교차로의 측면접촉사고 특성 및 사고모형 - 청주시를 사례로 -)

  • Park, Sang-Hyuk;Kim, Tae-Young;Park, Byung-Ho
    • International Journal of Highway Engineering / v.11 no.4 / pp.41-47 / 2009
  • This study deals with side-swipe accidents at 4-legged signalized intersections in Cheongju. The objectives are to analyze the characteristics of these accidents and to develop related models, with particular emphasis on finding an appropriate modelling methodology. The main results are as follows. First, among side-swipe accidents, injury accidents were about twice as frequent as property-damage-only accidents. The accidents occurred more often inside the intersection. Almost all of them were automobile-related and were caused by unsafe driving. Second, the multiple linear regression models were found to be more statistically significant than the multiple nonlinear models. The best-fitting models were those with the number of accidents as the dependent variable. The factors of side-swipe accidents identified in this study were ADT, intersection area, right-turn-only lane, number of pedestrian crossings, speed limit of the main road, maximum grade, and number of signal phases.

Analysis of AI interview data using unified non-crossing multiple quantile regression tree model (통합 비교차 다중 분위수회귀나무 모형을 활용한 AI 면접체계 자료 분석)

  • Kim, Jaeoh;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.753-762 / 2020
  • With increasing interest in integrating artificial intelligence (AI) into interview processes, the Republic of Korea (ROK) Army is seeking to lead in adopting and analyzing an AI-powered interview platform. This study analyzes AI interview data using a unified non-crossing multiple quantile regression tree (UNQRT) model. Compared to the UNQRT, existing models such as quantile regression and the quantile regression tree (QRT) model are inadequate for the analysis of AI interview data. Specifically, the linearity assumption of quantile regression is overly strong for this application. The QRT model relaxes the linearity assumption and so appears applicable, but it suffers from crossing among the estimated quantile functions, which leads to an uninterpretable model. The UNQRT circumvents the crossing problem by estimating multiple quantile functions simultaneously under a non-crossing constraint, and it is robust at extreme quantiles. Furthermore, the UNQRT builds a single tree, which yields a more interpretable model than the QRT. In this study, we used the UNQRT to explore the relationship between the results of the Army AI interview system and existing personnel data and to derive meaningful results.
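
Two ingredients of the discussion above can be made concrete: the pinball (check) loss that quantile regression minimizes, and a check for crossing between two estimated quantile curves. This is a minimal sketch and not the UNQRT itself, whose tree construction and constraint are not described in the abstract; the function names and sample values are hypothetical.

```python
from typing import Sequence


def pinball_loss(y_true: Sequence[float], y_pred: Sequence[float],
                 tau: float) -> float:
    """Mean pinball (check) loss of predictions for quantile level tau."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def count_crossings(lower_pred: Sequence[float],
                    upper_pred: Sequence[float]) -> int:
    """Count points where the lower-quantile prediction exceeds the
    upper-quantile prediction, i.e. where the quantile curves cross."""
    return sum(1 for lo, hi in zip(lower_pred, upper_pred) if lo > hi)


if __name__ == "__main__":
    # Made-up predictions for the 0.1 and 0.9 quantiles at three points.
    q10 = [1.0, 2.0, 3.0]
    q90 = [2.0, 1.5, 4.0]
    print(count_crossings(q10, q90))  # the second point crosses
```

Fitting each quantile level separately gives no guarantee that `count_crossings` returns zero, which is the motivation for estimating the levels jointly under a constraint.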

Autocovariance based estimation in the linear regression model (선형회귀 모형에서 자기공분산 기반 추정)

  • Park, Cheol-Yong
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.839-847 / 2011
  • In this study, we derive an autocovariance-based estimator for the vector of regression coefficients in the multiple linear regression model. The method was suggested by Park (2009). Although it does not seem intuitively attractive, the estimator is unbiased for the regression coefficient vector. When the vectors of explanatory variables satisfy some regularity conditions, and under mild conditions that hold when the errors come from autoregressive and moving average models, the estimator has asymptotically the same distribution as the least squares estimator and converges in probability to the regression coefficient vector. Finally, we provide a simulation study showing that the aforementioned theoretical results hold in small samples.

Estimation of Maximum Fresh Snow Depth using Regression Analysis (회귀분석을 이용한 최심신적설 추정식 개발)

  • Park, Heeseong;Chung, Gunhui
    • Proceedings of the Korea Water Resources Association Conference / 2016.05a / pp.205-205 / 2016
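  • Among winter natural disasters in Korea, damage caused by heavy snowfall is occurring with increasing frequency, and many studies are under way to predict and prepare for that damage. Snowfall is measured daily and is divided into maximum fresh snow depth, the amount of snow newly fallen each day, and maximum snow depth, which also includes the depth of previously accumulated, unmelted snow. Because most damage in Korea is caused by sudden heavy snowfall, predicting the maximum fresh snow depth is very important. In this study, we therefore developed an equation to estimate maximum fresh snow depth in Korea using multiple regression analysis. The independent variables were the precipitation forecast for the day, daily mean temperature, daily maximum temperature, and daily minimum temperature, and the model was constructed to account for the interaction between precipitation and daily mean temperature. To develop the model, two-thirds of the maximum fresh snow depth data from 74 weather stations nationwide were randomly sampled station by station, and the remaining one-third was used to validate the model. The multiple regression model with the interaction term showed far better predictive power than the multiple linear regression model without it. If precipitation and temperature are forecast accurately, the developed equation makes it easy to predict maximum fresh snow depth and is expected to be useful in preparing for heavy snowfall.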
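
The comparison between the additive model and the model with a precipitation-temperature interaction can be sketched as follows. Everything numeric here is an assumption: the data are synthetic, generated so that the interaction matters, only daily mean temperature is used (the study also used daily maximum and minimum temperature), and a single random 2/3 to 1/3 split replaces the station-by-station sampling. NumPy is assumed to be available.

```python
import numpy as np


def design(precip: np.ndarray, t_mean: np.ndarray, interaction: bool) -> np.ndarray:
    """Design matrix with intercept, precipitation, mean temperature,
    and optionally their product as an interaction term."""
    cols = [np.ones_like(precip), precip, t_mean]
    if interaction:
        cols.append(precip * t_mean)
    return np.column_stack(cols)


def rmse(y: np.ndarray, y_hat: np.ndarray) -> float:
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def compare_models(seed: int = 0, n: int = 300) -> dict:
    """Fit both models on 2/3 of synthetic data; score on the other 1/3."""
    rng = np.random.default_rng(seed)
    precip = rng.uniform(0.0, 30.0, n)   # synthetic precipitation
    t_mean = rng.uniform(-10.0, 5.0, n)  # synthetic daily mean temperature
    # Synthetic response: colder days turn more precipitation into snow.
    snow = 0.5 * precip - 0.08 * precip * t_mean + rng.normal(0.0, 0.5, n)

    order = rng.permutation(n)
    n_train = (2 * n) // 3
    train, test = order[:n_train], order[n_train:]

    scores = {}
    for label, use_interaction in (("additive", False), ("interaction", True)):
        X = design(precip, t_mean, use_interaction)
        coef, *_ = np.linalg.lstsq(X[train], snow[train], rcond=None)
        scores[label] = rmse(snow[test], X[test] @ coef)
    return scores


if __name__ == "__main__":
    print(compare_models())
```

On data built this way the interaction model has the lower held-out RMSE by construction; the sketch shows the mechanics of the comparison, not the study's result.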

A Correction of East Asian Summer Precipitation Simulated by PNU/CME CGCM Using Multiple Linear Regression (다중 선형 회귀를 이용한 PNU/CME CGCM의 동아시아 여름철 강수예측 보정 연구)

  • Hwang, Yoon-Jeong;Ahn, Joong-Bae
    • Journal of the Korean earth science society / v.28 no.2 / pp.214-226 / 2007
  • Because precipitation is influenced by various atmospheric variables, it is highly nonlinear. Precipitation predicted by a dynamic model can be corrected with a nonlinear artificial neural network, but this approach has limitations, such as the choice of initial weights, local minima, and the number of neurons. In the present paper, we correct simulated precipitation using multiple linear regression (MLR), a simple and widely used method. First, an ensemble hindcast is conducted with the PNU/CME Coupled General Circulation Model (CGCM) (Park and Ahn, 2004) for the period from April to August in 1979-2005. MLR is applied to the precipitation simulated by the PNU/CME CGCM for June (lead 2), July (lead 3), August (lead 4), and the seasonal mean JJA (June to August) over the Northeast Asian region including the Korean Peninsula ($110^{\circ}$-$145^{\circ}$E, $25^{\circ}$-$55^{\circ}$N). We build the MLR model from a linear relationship between observed precipitation and the hindcast results of the PNU/CME CGCM. The predictor variables selected from the CGCM are precipitation, 500 hPa vertical velocity, 200 hPa divergence, surface air temperature, and others. After leave-one-out cross-validation, the results are compared with those of the PNU/CME CGCM. The results, including Heidke skill scores, demonstrate that the MLR-corrected results forecast rainfall better than the direct CGCM output.
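
Leave-one-out cross-validation of a regression-based correction can be sketched as below: each case is predicted by a model fitted to all the other cases. This is a generic illustration using ordinary least squares in NumPy (assumed available); the function name, the toy data, and the use of a single predictor are assumptions and do not reproduce the paper's predictor set.

```python
import numpy as np


def loocv_predictions(X, y) -> np.ndarray:
    """Leave-one-out predictions from an ordinary least squares fit.

    X: predictors, shape (n,) or (n, p); an intercept column is added here.
    y: observed target values, shape (n,).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend the intercept column
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # drop case i from the training set
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef    # predict the held-out case
    return preds


if __name__ == "__main__":
    # Toy stand-ins: model-simulated values and matching observations.
    simulated = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    observed = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
    print(loocv_predictions(simulated, observed))
```

Skill scores computed on these held-out predictions avoid rewarding the regression for fitting the same cases it is evaluated on, which matters when the hindcast period covers only a few dozen years.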

Estimation of Spatio-temporal soil moisture and drought index based on MODIS multi-satellite images (MODIS 다중 위성영상 기반의 토양수분 및 가뭄지수 산정연구)

  • Chung, Jeehun;Kim, Juyeon;Kim, Hyeongseok;Jeong, Daeun;Kim, Seongjoon
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.446-446 / 2022
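  • In this study, we estimated nationwide spatio-temporal soil moisture and the soil-moisture-based drought index SWDI (Soil Water Deficit Index) from MODIS (MODerate resolution Imaging Spectroradiometer) multi-satellite images. As input data for estimating spatio-temporal soil moisture, we compiled MODIS products for land surface temperature (LST), evapotranspiration, and vegetation (Enhanced Vegetation Index, EVI; Fraction of Photosynthetically Active Radiation, FPAR; Leaf Area Index, LAI; Normalized Difference Vegetation Index, NDVI), together with ground-observed daily precipitation data. The MODIS images were corrected using the QC (Quality Control) image provided with each product, and spatial precipitation data were generated by applying inverse distance weighting, a spatial interpolation method, to synoptic weather observations from 92 stations nationwide provided by the Korea Meteorological Administration. Observed soil moisture came from TDR (Time Domain Reflectometry) sensors installed at a soil depth of 10 cm at 76 sites provided by the Rural Development Administration, and a soil map provided by the National Institute of Agricultural Sciences was compiled to account for soil properties in the soil moisture simulation. A multiple linear regression model (MLRM) was used to estimate soil moisture, with regression equations derived by season and soil texture. SWDI is calculated from the regression-based soil moisture and the field capacity and permanent wilting point of each soil texture, and will be compared with the timing and locations of actual droughts.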
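
Two steps named above, inverse distance weighting of station precipitation and the drought index computed from soil moisture, field capacity, and wilting point, can be sketched in plain Python. The abstract gives neither the weighting exponent nor the SWDI formula, so both are assumptions: the exponent defaults to 2, coordinates are treated as planar, and SWDI follows the commonly used form (theta - theta_FC) / (theta_FC - theta_WP) x 10. Function names and sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple

Station = Tuple[Tuple[float, float], float]  # ((x, y), observed value)


def idw(target: Tuple[float, float], stations: Sequence[Station],
        power: float = 2.0) -> float:
    """Inverse-distance-weighted estimate at `target`.

    Assumptions: planar coordinates and weights 1 / distance**power.
    """
    num = den = 0.0
    for (sx, sy), value in stations:
        dist = math.hypot(target[0] - sx, target[1] - sy)
        if dist == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


def swdi(theta: float, theta_fc: float, theta_wp: float) -> float:
    """Soil Water Deficit Index in the commonly used form (an assumption
    here): 0 at field capacity, -10 at the permanent wilting point."""
    return (theta - theta_fc) / (theta_fc - theta_wp) * 10.0


if __name__ == "__main__":
    gauges = [((-1.0, 0.0), 10.0), ((1.0, 0.0), 20.0)]  # made-up stations
    print(idw((0.0, 0.0), gauges))
    print(swdi(0.20, theta_fc=0.30, theta_wp=0.10))
```

With this form of the index, more negative values indicate a larger soil water deficit, which is what would be compared against recorded drought periods.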
