• Title/Summary/Keyword: multiple regression (다중 회귀)

A Study on Estimation of Lowflow Ungauged Basin Using Multiple Regression Analysis (다중회귀분석을 이용한 미계측 유역의 갈수유량 산정에 관한 연구)

  • Lim, Ga Kyun;Jeung, Se Jin;Kim, Byung Sik
    • Proceedings of the Korea Water Resources Association Conference / 2020.06a / pp.133-133 / 2020
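  • Low flow is the discharge that is maintained on 355 days of the year. Knowing its magnitude and frequency is essential for water supply planning and management, reservoir design, management of irrigation water quantity and quality, and ecosystem conservation. Estimating low flow requires a long record of observed daily discharge, but in Korea the observed discharge records contain many missing values, so the long-term data needed for low-flow estimation are scarce. This study therefore developed regression equations for estimating low flow by low-flow frequency for 40 mid-sized watersheds nationwide. Eighteen basin factors and four hydrological factors applicable to low-flow estimation were screened for multicollinearity through correlation analysis, and the factors applicable to ungauged basins were selected based on the results. Through low-flow frequency analysis and stepwise regression analysis, regression equations for low flow by frequency that can be applied to ungauged basins were developed. In addition, gauged basins were treated as ungauged, low flow was estimated with the developed equations, and the estimates were compared with the actual low flow to assess the adequacy of the equations.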
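
The 355-day definition of low flow used in this abstract can be illustrated with a short sketch. It assumes one year of daily discharge values and takes the 355th largest value, that is, the flow equalled or exceeded on 355 days. The function name `low_flow_q355` and the data are hypothetical and are not taken from the paper.

```python
def low_flow_q355(daily_flows):
    """Return the flow that is equalled or exceeded on 355 days of the year."""
    if len(daily_flows) < 355:
        raise ValueError("need at least 355 daily values")
    ranked = sorted(daily_flows, reverse=True)  # largest flow first
    return ranked[354]  # 355th largest value (0-based index 354)


# Synthetic one-year record (1, 2, ..., 365), purely for illustration.
flows = [float(day) for day in range(1, 366)]
q355 = low_flow_q355(flows)
```

With real records, missing days would have to be handled before ranking, which is the data problem the abstract describes.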

Comparison of GEE Estimation Methods for Repeated Binary Data with Time-Varying Covariates on Different Missing Mechanisms (시간-종속적 공변량이 포함된 이분형 반복측정자료의 GEE를 이용한 분석에서 결측 체계에 따른 회귀계수 추정방법 비교)

  • Park, Boram;Jung, Inkyung
    • The Korean Journal of Applied Statistics / v.26 no.5 / pp.697-712 / 2013
  • When analyzing repeated binary data, the generalized estimating equations (GEE) approach produces consistent estimates of the regression parameters even if an incorrect working correlation matrix is used. However, in finite samples, the coefficients of time-varying covariates change more across working correlation structures than those of time-invariant covariates. In addition, the GEE approach may give biased estimates under missing at random (MAR). Weighted estimating equations and multiple imputation methods have been proposed to reduce bias in parameter estimates under MAR. This article studies whether the two methods produce robust estimates across various working correlation structures for longitudinal binary data with time-varying covariates under different missing mechanisms. Through simulation, we observe that time-varying covariates show greater differences in parameter estimates across working correlation structures than time-invariant covariates. The multiple imputation method produces more robust estimates under any working correlation structure and smaller biases than the other two methods.

Principal Components Regression in Logistic Model (로지스틱모형에서의 주성분회귀)

  • Kim, Bu-Yong;Kahng, Myung-Wook
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.571-580 / 2008
  • Logistic regression analysis is widely used in customer relationship management and credit risk management. It is well known that maximum likelihood estimation is not appropriate when multicollinearity exists among the regressors. We therefore propose logistic principal components regression to deal with the multicollinearity problem. In particular, a new method is suggested for selecting the proper principal components. The selection method is based on the condition index instead of the eigenvalue. When a condition index is larger than the upper cutoff limit, the principal component corresponding to that index is removed from the estimation. When a condition index lies between the upper and lower limits, a hypothesis test is applied sequentially to decide whether to eliminate the principal component. The limits are obtained from a linear model constructed on the basis of conjoint analysis. The proposed method is evaluated by means of the variance of the estimates and the correct classification rate. The results indicate that the proposed method is superior to the existing method in terms of efficiency and goodness of fit.
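
The selection rule in this abstract can be written compactly. The sketch below assumes the usual definition of the condition index from the eigenvalues $\lambda_1, \dots, \lambda_p$ of the matrix whose principal components are used; the abstract does not state the scaling, and the symbols $\eta_j$, $c_L$, and $c_U$ are notation introduced here, not the paper's.

$$
\eta_j = \sqrt{\frac{\lambda_{\max}}{\lambda_j}}, \qquad j = 1, \dots, p
$$

$$
\text{component } j \text{ is }
\begin{cases}
\text{removed} & \text{if } \eta_j > c_U, \\
\text{subjected to a sequential hypothesis test} & \text{if } c_L \le \eta_j \le c_U, \\
\text{retained (implied by the abstract)} & \text{if } \eta_j < c_L.
\end{cases}
$$

Here $c_L$ and $c_U$ stand for the lower and upper cutoff limits, which the paper obtains from a linear model based on conjoint analysis; their values are not given in the abstract.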

Development of Empirical Formulas for Storage Function Method (저류함수법의 매개변수 산정식 개발)

  • Choi, Jong-Nam;Ahn, Won-Shik;Kim, Tae-Gyun;Chung, Gun-Hui
    • Journal of the Korean Society of Hazard Mitigation / v.9 no.5 / pp.125-130 / 2009
  • The storage function method, which accounts for the non-linear relationship between rainfall and runoff, has frequently been used to predict basin runoff and flood patterns. However, estimating appropriate parameters for every basin and rainfall event is time-consuming, which calls for empirical parameter equations applicable in Korea. In this study, multiple regression analysis is used to develop empirical equations that estimate the parameters of the storage function method from basin characteristics. The regression analysis retained basin area, maximum stream length, and stream slope as the basin characteristics. Collinearity was removed, and a trial-and-error method was used to choose the parameters that best describe the dependent variables in the Han River basin, which was divided into 30 subbasins. The developed equations were validated using rainfall events at the MunMak gauging station and named the 'Han River equation'. The equation can provide useful information on storage function method parameters for calculating basin runoff and predicting river stage.

Relationship Between Construction Productivity and the Weather Elements in the Reinforced Concrete Structure for the High-rise Apartment Buildings (기후요소와 생산성간의 상관관계 분석에 관한 연구 - 공동주택 철근콘크리트 골조공사를 중심으로 -)

  • Kim Shin-Tae;Kim Yea-Sang;Chin Sang-yoon
    • Korean Journal of Construction Engineering and Management / v.5 no.6 s.22 / pp.80-89 / 2004
  • Among the various factors influencing construction productivity, weather conditions are very important in planning and executing a construction project. This is especially true in Korea, where the weather changes dramatically from season to season. In this study, the relationship between the construction productivity of reinforced concrete structural work for high-rise apartment buildings and five weather elements (temperature, humidity, day time, rainfall, and wind velocity) was analyzed. The regression results showed that the weather elements together explain 58.8% of productivity, and that temperature and day time were the more important factors among them.

Shrinkage Structure of Ridge Partial Least Squares Regression

  • Kim, Jong-Duk
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.327-344 / 2007
  • Ridge partial least squares regression (RPLS) is a regression method obtained by combining ridge regression and partial least squares regression. It is intended to provide better predictive ability and to be less sensitive to overfitting. In this paper, explicit expressions for the shrinkage factor of RPLS are developed. The structure of the shrinkage factor is explored and compared with those of other biased regression methods, such as ridge regression, principal component regression, ridge principal component regression, and partial least squares regression, using a near-infrared data set.
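
For orientation, the shrinkage factors of two of the comparison methods named in this abstract have well-known closed forms. The sketch below assumes a full-column-rank design matrix with singular value decomposition $X = U D V^{\top}$ and singular values $d_1 \ge \dots \ge d_p > 0$; the notation ($f_i$, $k$, $m$) is introduced here, and the RPLS expressions derived in the paper are not reproduced.

$$
\hat{\beta}_{\mathrm{OLS}} = \sum_{i=1}^{p} \frac{u_i^{\top} y}{d_i}\, v_i,
\qquad
\hat{\beta}(f) = \sum_{i=1}^{p} f_i\, \frac{u_i^{\top} y}{d_i}\, v_i
$$

$$
f_i^{\mathrm{ridge}} = \frac{d_i^{2}}{d_i^{2} + k} \quad (k \ge 0),
\qquad
f_i^{\mathrm{PCR}} =
\begin{cases}
1 & i \le m, \\
0 & i > m.
\end{cases}
$$

Ridge regression shrinks every principal direction smoothly, most strongly where $d_i$ is small, while principal component regression with $m$ retained components keeps or discards each direction entirely.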

The Estimation of Software Development Effort Using Multiple Regression Method (다중회귀 분석을 이용한 소프트웨어 개발노력추정)

  • Jung Hye-Jung;Yang Hae-Sool;Shin Seok-Kyoo;Lee Sang-Un
    • The KIPS Transactions:PartD / v.11D no.7 s.96 / pp.1483-1490 / 2004
  • To complete a project successfully, development effort must be estimated accurately. However, development effort varies with software size and operating environment. Function points are commonly used to estimate development effort. In this paper, we use data from 789 development projects carried out in the 1990s. We investigate the variables that affect development effort and perform multiple linear regression analysis to examine the linear relationships among the variables. We derive stage-wise regression equations by dividing PDR, which influences development effort, step by step.

Non-Response Imputation for Panel Data (패널자료의 무응답 대체법)

  • Pak, Gi-Deok;Shin, Key-Il
    • Communications for Statistical Applications and Methods / v.17 no.6 / pp.899-907 / 2010
  • Several non-response imputation methods have been suggested; however, mainly cross-sectional imputation has been studied and applied. A simple and common imputation method for panel data is cross-wave regression imputation, or carry-over imputation as a special case of it. This study suggests a multiple imputation method that combines time series analysis with a cross-sectional multiple imputation method. We compare this method with the cross-wave regression imputation method using MSE, MAE, and bias. The 2008 monthly labor survey data are used for this study.
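
A minimal sketch of the simplest baseline mentioned in this abstract, carry-over imputation, together with the three comparison criteria (bias, MAE, MSE). It assumes one unit's values across panel waves with `None` marking non-response; the function names and numbers are hypothetical and do not reproduce the paper's method or data.

```python
def carry_over_impute(series):
    """Fill each missing value (None) with the most recent observed value."""
    filled, last = [], None
    for value in series:
        if value is None:
            value = last  # stays None if nothing has been observed yet
        else:
            last = value
        filled.append(value)
    return filled


def error_metrics(true_values, imputed_values):
    """Return (bias, MAE, MSE) of imputed values against the true values."""
    errors = [i - t for t, i in zip(true_values, imputed_values)]
    n = len(errors)
    bias = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return bias, mae, mse


# Synthetic panel values for one unit over five waves (illustration only).
observed = [10.0, None, 12.0, None, None]
truth = [10.0, 11.0, 12.0, 13.0, 15.0]

imputed = carry_over_impute(observed)
missing = [k for k, v in enumerate(observed) if v is None]
bias, mae, mse = error_metrics(
    [truth[k] for k in missing], [imputed[k] for k in missing]
)
```

The metrics are computed only on the waves that were actually missing, so observed values do not dilute the error.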

Development of a Daily Snowmelt Depth Model using Multiple Linear Regression (다중회귀모형을 활용한 일 단위 융설 깊이 예측 모형 개발)

  • Oh, Yeoung Rok;Lee, Gyumin;Shin, Hyungjin;Jun, Kyung Soo
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.374-374 / 2021
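  • Heavy snowfall has recently caused damage in Korea as well, and most of that damage is attributable mainly to the snow cover remaining after snowfall. Predicting snow depth therefore provides important information for responding to heavy-snow damage. In this study, a multiple regression model was constructed to simulate daily snowmelt using snow depth, air temperature, humidity, and solar radiation, which were judged to affect snowmelt. The model was built from snowfall events from 2000 to 2020 and applied to snowfall events that occurred in 2021 in Gwangju, Daegwallyeong, Mokpo, Seosan, and Jeonju. The mean snow depth in the study areas was 7.41 cm, and the mean RMSE was 1.64 cm. As a source of error, when snow depth decreased by less than 1 cm, the effects of wind and sublimation can be relatively large, but the function used in this study does not account for wind, evapotranspiration, and similar processes. In addition, the regression coefficients cannot readily reflect rapid temperature changes, so errors are larger at sharply rising or very low temperatures. Therefore, to predict snowmelt depth with this function, it appears necessary to add a variable or constant that can control for the effects of very high or very low temperatures. Furthermore, because the air temperature and humidity at the time of the initial snowfall alter the snow crystals and can thereby affect snowmelt, a variable describing the initial snow cover should also be considered.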
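
The workflow in this abstract (fit a multiple linear regression on past events, then score later events by RMSE) can be sketched with the standard library alone. The four predictors follow the abstract; every number below is synthetic, the function names are hypothetical, and solving the normal equations directly is a simplification chosen for brevity, not the authors' procedure.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coefs, row):
    """Predicted response for one row of predictor values."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Synthetic rows: (snow depth cm, air temperature C, humidity %, solar radiation).
calibration_rows = [
    (10, 1, 60, 8), (5, 3, 55, 10), (20, -2, 70, 6), (8, 0, 65, 12),
    (15, 4, 50, 9), (12, -1, 75, 7), (3, 5, 45, 11), (7, 2, 58, 9),
]
calibration_melt = [2.1, 2.9, 1.4, 1.8, 3.6, 1.0, 3.3, 2.4]  # synthetic, cm/day
validation_rows = [(18, -3, 72, 5), (6, 4, 52, 10)]
validation_melt = [0.9, 3.1]  # synthetic, cm/day

coefs = fit_ols(calibration_rows, calibration_melt)
validation_rmse = rmse(validation_melt, [predict(coefs, r) for r in validation_rows])
```

Holding out later events, as the study does with its 2021 events, keeps the RMSE from being flattered by the calibration fit.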

Analysis of AI interview data using unified non-crossing multiple quantile regression tree model (통합 비교차 다중 분위수회귀나무 모형을 활용한 AI 면접체계 자료 분석)

  • Kim, Jaeoh;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.33 no.6 / pp.753-762 / 2020
  • With increasing interest in integrating artificial intelligence (AI) into interview processes, the Republic of Korea (ROK) Army is trying to lead and analyze an AI-powered interview platform. This study analyzes the AI interview data using a unified non-crossing multiple quantile regression tree (UNQRT) model. Compared with the UNQRT, existing models such as quantile regression and the quantile regression tree (QRT) model are inadequate for the analysis of AI interview data. Specifically, the linearity assumption of quantile regression is overly strong for this application. While the QRT model seems applicable because it relaxes the linearity assumption, it suffers from crossing among the estimated quantile functions, which leads to an uninterpretable model. The UNQRT circumvents the crossing problem by simultaneously estimating multiple quantile functions under a non-crossing constraint, and it is robust at extreme quantiles. Furthermore, the single tree constructed by the UNQRT yields a more interpretable model than the QRT model. In this study, we used the UNQRT to explore the relationship between the results of the Army AI interview system and the existing personnel data and to derive meaningful results.
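
The crossing problem described in this abstract can be made concrete with a small sketch. It shows the check (pinball) loss that quantile regression minimizes and a test for whether quantile predictions estimated separately cross each other. It does not implement the UNQRT model; the function names and numbers are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Average check (pinball) loss for a quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def has_crossing(quantile_preds):
    """Return True if a lower-level quantile prediction exceeds a higher-level one.

    quantile_preds maps each quantile level tau to a list of predictions,
    one per observation, in the same observation order for every level.
    """
    taus = sorted(quantile_preds)
    for lower, upper in zip(taus, taus[1:]):
        if any(a > b for a, b in zip(quantile_preds[lower], quantile_preds[upper])):
            return True
    return False


# Synthetic predictions for two observations at three quantile levels.
separate_fits = {0.1: [1.0, 2.0], 0.5: [2.0, 3.0], 0.9: [3.0, 2.5]}
crossing_found = has_crossing(separate_fits)
```

In the second observation the 0.9 quantile (2.5) falls below the 0.5 quantile (3.0), which is the kind of inconsistency a non-crossing constraint rules out.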