• Title/Summary/Keyword: multivariable linear regression analysis (다변수 선형회귀분석)


Multivariate Analysis for Clinicians (임상의를 위한 다변량 분석의 실제)

  • Oh, Joo Han;Chung, Seok Won
    • Clinics in Shoulder and Elbow / v.16 no.1 / pp.63-72 / 2013
  • In medical research, multivariate analysis, especially multiple regression analysis, is used to analyze the influence of multiple variables on an outcome. Because many variables are involved, multiple regression analysis must address which variables to include in the model and the problem of multicollinearity, in addition to the basic assumptions of regression analysis. The explanatory power of a multiple regression model is expressed by the coefficient of determination, $R^2$, and the influence of each independent variable on the outcome by its regression coefficient, ${\beta}$. Multiple regression analysis can be divided into multiple linear regression analysis, multiple logistic regression analysis, and Cox regression analysis according to the type of dependent variable (continuous variable, categorical variable (binary logit), and state variable, respectively), and the influence of variables on the outcome is evaluated by the regression coefficient ${\beta}$, the odds ratio, and the hazard ratio, respectively. Knowledge of multivariate analysis enables clinicians to analyze results accurately and to design further research efficiently.
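
To make the roles of $R^2$ and ${\beta}$ in the abstract above concrete, the following is a minimal sketch of multiple linear regression by ordinary least squares. It is not taken from the article: the function names `solve_linear_system` and `fit_multiple_linear_regression` and the toy data are hypothetical, and the fit solves the normal equations in plain Python purely for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple_linear_regression(rows, y):
    """Return (beta, r_squared); beta[0] is the intercept."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot


# Hypothetical toy data generated exactly from y = 1 + 2*x1 - 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]
beta, r2 = fit_multiple_linear_regression(rows, y)
```

A statistics package would normally also report standard errors and multicollinearity diagnostics, which this sketch omits.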

Prediction of Gas Chromatographic Retention Times of PAH Using QSRR (기체크로마토그래피에서 QSRR을 통한 PAH 용리시간 예측)

  • Kim, Young Gu
    • Journal of the Korean Chemical Society / v.45 no.5 / pp.422-428 / 2001
  • Relative retention times (RRTs) of PAH molecules and their derivatives in gas chromatography are trained and then predicted for testing sets using multiple linear regression (MLR) and an artificial neural network (ANN). The main descriptors of PAHs and their derivatives in QSRR are the square root of molecular weight (sqmw), molecular connectivity ($^1{\chi}_v$), molecular dipole moment (D), and length-to-breadth ratio (L/B). The MLR results show that a heavy molecule tends to have a long retention time. L/B, which is closely related to the slot model, is a good descriptor in MLR. On the other hand, the ANN, which is not affected by linear dependencies among the descriptors, was based exclusively on molecular weight and molecular dipole moment. The variances, which show the accuracy of retention-time prediction in the testing sets, are 1.860 and 0.206 for MLR and ANN, respectively. It was shown that the ANN can exceed MLR in prediction accuracy.


Evaluation of applicability of pan coefficient estimation method by multiple linear regression analysis (다변량 선형회귀분석을 이용한 증발접시계수 산정방법 적용성 검토)

  • Rim, Chang-Soo
    • Journal of Korea Water Resources Association / v.55 no.3 / pp.229-243 / 2022
  • The effects of monthly meteorological data measured at 11 stations in South Korea on the pan coefficient were analyzed to develop four types of multiple linear regression models for estimating pan coefficients. To evaluate their applicability, the developed models were compared with six previous models. Pan coefficients were most affected by air temperature in January, February, March, July, November, and December, and by solar radiation in the other months. Over all 12 months of the year, the effects of wind speed and relative humidity on the pan coefficient were less significant than those of air temperature and solar radiation. For all meteorological stations and months, the model developed by applying 5 independent variables (wind speed, relative humidity, air temperature, ratio of sunshine duration to daylight duration, and solar radiation) for each station was the most effective for evaporation estimation. The model validation results indicate that the multiple linear regression models can be applied to some particular stations and months.

Multivariate quantile regression tree (다변량 분위수 회귀나무 모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • Journal of the Korean Data and Information Science Society / v.28 no.3 / pp.533-545 / 2017
  • Quantile regression models provide a variety of useful statistical information by estimating the conditional quantile function of the response variable. However, the traditional linear quantile regression model can lead to distorted and incorrect results when analysing real data in which the relationship between the explanatory variables and the response variables is nonlinear. Furthermore, as the complexity of the data increases, multiple response variables need to be analysed simultaneously with more sophisticated interpretations. For these reasons, we propose a multivariate quantile regression tree model. In this paper, a new split variable selection algorithm is suggested for a multivariate regression tree model. This algorithm can select the split variable more accurately than the previous method without significant selection bias. We investigate the performance of our proposed method with both simulation and real data studies.

How to Measure Nonlinear Dependence in Hydrologic Time Series (시계열 수문자료의 비선형 상관관계)

  • Mun, Yeong-Il
    • Journal of Korea Water Resources Association / v.30 no.6 / pp.641-648 / 1997
  • Mutual information is useful for analyzing nonlinear dependence in time series in much the same way as correlation is used to characterize linear dependence. We use multivariate kernel density estimators to estimate mutual information at different time lags for single and multiple time series. This approach is tested on a variety of hydrologic data sets and suggests an appropriate delay time $\tau$ at which the mutual information is almost zero; multi-dimensional phase portraits can then be constructed from measurements of a single scalar time series.

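
The idea of scanning time lags for low mutual information, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: it swaps the kernel density estimators named in the abstract for a much simpler histogram (binned) estimate, and the function name `lagged_mutual_information`, the bin count, and the synthetic sine series are all hypothetical.

```python
import math
from collections import Counter


def lagged_mutual_information(series, lag, bins=4):
    """Histogram estimate (in nats) of the mutual information
    between x_t and x_{t+lag} for a single scalar series."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # avoid zero width for a constant series

    def bin_of(v):
        # Clamp so the maximum value falls in the last bin.
        return min(int((v - lo) / width), bins - 1)

    pairs = [(bin_of(a), bin_of(b)) for a, b in zip(series, series[lag:])]
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    # Sum over occupied cells of p(a,b) * log(p(a,b) / (p(a) p(b))).
    return sum(
        (c / n) * math.log(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )


# Hypothetical example: scan lags 1..10 of a synthetic periodic series
# and pick the lag with the smallest estimated mutual information.
series = [math.sin(0.3 * t) for t in range(200)]
mi_by_lag = {lag: lagged_mutual_information(series, lag) for lag in range(1, 11)}
best_lag = min(mi_by_lag, key=mi_by_lag.get)
```

A binned estimate is sensitive to the number of bins and the sample size, which is one reason a smoother estimator such as a kernel density estimate may be preferred.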

Penalized least distance estimator in the multivariate regression model (다변량 선형회귀모형의 벌점화 최소거리추정에 관한 연구)

  • Jungmin Shin;Jongkyeong Kang;Sungwan Bang
    • The Korean Journal of Applied Statistics / v.37 no.1 / pp.1-12 / 2024
  • In many real-world data sets, multiple response variables depend on the same set of explanatory variables. In particular, if several response variables are correlated with each other, simultaneous estimation that accounts for the correlation between response variables may be more effective than analyzing each response variable individually. In such multivariate regression analysis, the least distance estimator (LDE) estimates the regression coefficients simultaneously by minimizing the distance between each training observation and its estimate in a multidimensional Euclidean space. It is also robust to outliers. In this paper, we examine the least distance estimation method in multivariate linear regression analysis and, furthermore, present the penalized least distance estimator (PLDE) for efficient variable selection. We propose the LDE with the adaptive group LASSO penalty term (AGLDE), which can reflect the correlation between response variables in the model and can efficiently select variables according to the importance of the explanatory variables. The validity of the proposed method was confirmed through simulations and real data analysis.

Locally Weighted Polynomial Forecasting Model (지역가중다항식을 이용한 예측모형)

  • Mun, Yeong-Il
    • Journal of Korea Water Resources Association / v.33 no.1 / pp.31-38 / 2000
  • Relationships between hydrologic variables are often nonlinear. Usually the functional form of such a relationship is not known a priori. A multivariate, nonparametric regression methodology is provided here for approximating the underlying regression function using locally weighted polynomials. Locally weighted polynomials approximate the target function through a Taylor series expansion of the function in the neighborhood of the point of estimate. The utility of this nonparametric regression approach is demonstrated through an application to nonparametric short-term forecasts of the biweekly Great Salt Lake volume.

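
The locally weighted polynomial idea in the abstract above can be illustrated with a first-order (local linear) fit in one predictor. This is a minimal sketch, not the article's multivariate method: the Gaussian kernel, the function name `local_linear_estimate`, the `bandwidth` parameter, and the toy data are assumptions made for illustration.

```python
import math


def local_linear_estimate(xs, ys, x0, bandwidth):
    """Fit a weighted least-squares line around x0 and return its value at x0.

    Points near x0 receive larger Gaussian kernel weights, so the line
    acts as a first-order Taylor approximation of the target function.
    """
    w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    # Weighted means of the predictor and the response.
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    # Weighted slope from the centered sums of squares and cross products.
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    return my + slope * (x0 - mx)


# Hypothetical toy data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]
estimate = local_linear_estimate(xs, ys, 2.5, bandwidth=1.0)
```

The bandwidth controls the size of the neighborhood: a small value follows local structure closely, while a large value approaches an ordinary global linear fit.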

Comparison of Principal Component Regression and Nonparametric Multivariate Trend Test for Multivariate Linkage (다변량 형질의 유전연관성에 대한 주성분을 이용한 회귀방법와 다변량 비모수 추세검정법의 비교)

  • Kim, Su-Young;Song, Hae-Hiang
    • The Korean Journal of Applied Statistics / v.21 no.1 / pp.19-33 / 2008
  • The linear regression method proposed by Haseman and Elston (1972) for detecting linkage to a quantitative trait in sib pairs is a linkage testing method for a single locus and a single trait. However, multivariate methods for detecting linkage are needed when information on several traits affected by the same major gene is available for each individual. Amos et al. (1990) extended the regression method of Haseman and Elston (1972) to incorporate observations of two or more traits by estimating the principal component linear function that results in the strongest correlation between the squared pair differences in the trait measurements and identity by descent at a marker locus. However, it is currently impossible to control the probability of type I error with this method, since the exact distribution of the statistic it uses is still unknown. In this paper, we propose a multivariate nonparametric trend test for detecting linkage to multiple traits. Using a simulation study, we compared the efficiency of the multivariate nonparametric trend test with that of the method developed by Amos et al. (1990) for quantitative trait data. For the multivariate nonparametric trend test, the simulation results show that the type I error rates are close to the predetermined significance levels and that the test generally has high power.

Coastal Shallow-Water Bathymetry Survey through a Drone and Optical Remote Sensors (드론과 광학원격탐사 기법을 이용한 천해 수심측량)

  • Oh, Chan Young;Ahn, Kyungmo;Park, Jaeseong;Park, Sung Woo
    • Journal of Korean Society of Coastal and Ocean Engineers / v.29 no.3 / pp.162-168 / 2017
  • A shallow-water bathymetry survey was conducted using high-definition color images obtained by a drone at an altitude of 100 m above sea level. Shallow-water bathymetry data are among the most important input data for research on beach erosion problems. Accurate bathymetry data within the closure depth are especially critical, because most of the phenomena of interest occur in the surf zone. However, it is extremely difficult to obtain accurate bathymetry data in this region because of wave-induced currents and breaking waves. Therefore, an optical remote sensing technique using a small drone is considered an attractive alternative. This paper presents the potential use of image processing algorithms that apply multi-variable linear regression to red, green, blue, and grey band images to estimate shallow water depth using a drone with an HD camera. Optical remote sensing analysis conducted at Wolpo beach showed promising results. Estimated water depths within 5 m showed a correlation coefficient of 0.99 and a maximum error of 0.2 m compared with water depths surveyed through manual as well as ship-board echo-sounder measurements.

Analysis of Factors Influencing upon the Metro Wear Using the Classification and Regression Trees (CART 분석을 이용한 지하철 마모 영향인자 분석)

  • Jeong, Min Chul;Lee, Won Woo;Kim, Jung Hoon;Kong, Jung Sik
    • Korean Society of Hazard Mitigation: Conference Proceedings (한국방재학회:학술대회논문집) / 2011.02a / pp.38-38 / 2011
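  • Rail wear generally has a large effect on the running safety and ride comfort of trains and is a major cause of noise and vibration. When rail wear occurs, it also accelerates the deterioration of the track structure, greatly increasing vehicle and track maintenance costs. It is therefore an important task to design vehicle operating conditions, track alignment, and track structure so as to reduce wear, by systematically analyzing the causes of wear that arise in the field, such as section characteristics and environmental factors. CART (Classification And Regression Tree) analysis is a well-packaged classification and prediction technique that is very useful for selecting input variables, for example by using the most important input variables, which generally appear at the upper split levels of the tree. In this study, to carry out a regression tree-based model (TBM: Tree Based Model) analysis of the correlations in inspection data that takes multivariable section characteristics and environmental factors into account, we used wear data from Subway Line 2 together with the multivariable section characteristics and environmental factors that affect the wear data. The section-characteristic and environmental factors of Subway Line 2 were classified as rail type, rail location, trackbed, radius of curvature, cant, slack, and number of days in operation. There are two rail types, ks-50kg and ks-60kg, and rail location is broadly divided into above ground and underground. The trackbed is classified as concrete, ballast, or, in some sections, vibration-isolating concrete; the radius of curvature is classified into straight sections, transition-curve sections, and circular-curve sections with radii distributed from a minimum of 250 m up to 627 m. Cant is classified over intervals from a minimum of 96 cm to 120 cm, and slack is distributed over 5 to 9 cm. For the operating period, sections with no maintenance history during that period were selected, and inner-track wear data for Subway Line 2 measured four times between 2005 and 2006 were used. A total of seven section or environmental characteristics, X1 through X7, were selected as influencing factors, and vertical wear (Y1) and side wear (Y2) were selected as the dependent factors determined by them; of these, we analyzed the correlation between the section-characteristic and environmental factors and side wear, which is in practice used as the main criterion for evaluating subway track performance. We extracted 12,272 data points with no maintenance history during the period in which the wear data were measured and analyzed them using the CART program; for the CART analysis, the dependent variable, the amount of vertical wear, was converted into a grade corresponding to the amount of wear at each inspection point. The CART-based correlation analysis between the section-characteristic and environmental factors affecting rail wear and the rail wear amount used as the dependent variable is similar to the relationships among influencing factors in the actual structure. In future work, we plan to build on this result and analyze correlations with track inspection data, using quantified inspection data as the dependent variable and considering external influencing factors such as section characteristics or environmental factors.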
