• Title/Summary/Keyword: 회귀법 (regression method)

Search Result 1,737, Processing Time 0.033 seconds

Smoothing parameter selection in semi-supervised learning (준지도 학습의 모수 선택에 관한 연구)

  • Seok, Kyungha
    • Journal of the Korean Data and Information Science Society / v.27 no.4 / pp.993-1000 / 2016
  • Semi-supervised learning makes it easy to use unlabeled data in supervised learning tasks such as classification. Applying semi-supervised learning to regression analysis, we propose two methods for better estimation of the regression function. The proposed methods assume different marginal densities of the independent variables and different smoothing parameters for the unlabeled and labeled data. We show that an overfitted pilot estimator should be used to achieve the fastest convergence rate, and that unlabeled data may help to improve the convergence rate when the smoothing parameters are well estimated. We also find the conditions on the smoothing parameters needed to achieve the optimal convergence rate.

A Study on the Treatment of Uncertainty in Linear Regression Method for Chemical Analysis (회귀식 사용에 따른 화학 분석 과정의 불확도 처리 연구)

  • Woo, Jin-Chun;Suh, JungKee;Lim, MyungChul;Park, MinSu
    • Analytical Science and Technology / v.16 no.3 / pp.185-190 / 2003
  • We applied a modified least squares (MLS) method and the ordinary least squares (OLS) method to a first-order equation to compare the uncertainties calculated by the two methods. The uncertainty calculated by OLS covered a statistically safe interval because it was overestimated for many measurement cases and concentration levels. However, when the uncertainty of the reference concentration was comparably large (about 5% relative standard deviation of random scatter about the regression line and about 7% relative standard uncertainty of the reference values), the uncertainty calculated by OLS was seriously underestimated at high concentration levels, and the calculated uncertainty did not cover a statistically safe interval at the stated confidence level. The MLS method described in the previous article was found to be valid for this uncertainty calculation.
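
As a rough illustration of the OLS side of this comparison, the sketch below fits a first-order calibration line and computes the textbook standard uncertainty of a concentration predicted from it. It treats the reference concentrations as exact, which is the assumption the abstract links to underestimation when reference uncertainty is large. The MLS method is not described in the abstract and is not reproduced here. The function name and its arguments are hypothetical.

```python
import math

def ols_calibration_uncertainty(conc, signal, y0, m=1):
    """Fit signal = a + b*conc by OLS, then invert the line for an unknown.

    Returns (x0, u_x0): the concentration predicted for a mean response y0
    (the average of m replicate readings) and its standard uncertainty.
    Assumption: the reference concentrations carry no uncertainty.
    """
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, signal))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # intercept
    # Residual standard deviation about the regression line (n - 2 dof).
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(conc, signal))
    s_res = math.sqrt(ss_res / (n - 2))
    x0 = (y0 - a) / b
    u_x0 = (s_res / abs(b)) * math.sqrt(
        1 / m + 1 / n + (y0 - mean_y) ** 2 / (b ** 2 * sxx)
    )
    return x0, u_x0
```

The last term under the square root grows as the response moves away from the centre of the calibration range, so the predicted uncertainty is smallest near the mean of the standards.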

Nonstationary Frequency Analysis at Seoul Using a Power Model (Power 모형을 이용한 서울지점 비정상성 빈도해석)

  • Lee, Gi-Chun;Kim, Gwang-Seob;Choi, Kyu-Hyun
    • Proceedings of the Korea Water Resources Association Conference / 2012.05a / pp.461-461 / 2012
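  • This study performed a nonstationary frequency analysis for the Seoul station, building an annual maximum rainfall series for a 24-hour duration in order to estimate the probable rainfall by return period for each target year (2040, 2070, and 2100). From the annual maximum rainfall data, cumulative series were built by starting with the first 20 years and adding one year at a time, and the mean, location parameter, and scale parameter were estimated for each cumulative period. The nonstationary frequency analysis used the Gumbel distribution, with each parameter estimated by the method of probability weighted moments. In addition to a linear regression of the cumulative mean rainfall on year, logistic regression and a power model against year were applied to the basin-wide cumulative mean rainfall obtained from all stations in the Han River basin, to which the Seoul station belongs, to estimate the cumulative mean rainfall at Seoul for each target year. From these values, the location and scale parameters for each target year were obtained and the probable rainfall by return period was estimated. With linear regression, the cumulative mean rainfall became very high as the target year increased because of the linear growth, so the probable rainfall was also far higher than that obtained under the stationarity assumption, which limits its validity as a probable rainfall estimate. When the probable rainfall was estimated from the basin-average behavior with logistic regression, it did not rise far above the stationary probable rainfall and increased in a relatively stable manner compared with linear regression. However, the cumulative mean rainfall estimated by logistic regression converged before the target year 2040, so the probable rainfall took the same value for all target years. The nonstationary frequency analysis using the basin-average behavior of the Han River basin with the power model yielded probable rainfall that compensates for the problems found in the analyses based on linear and logistic regression.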
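
The parameter-estimation step named in this abstract can be sketched as follows: a Gumbel fit by probability weighted moments, followed by the rainfall quantile for a given return period. This is a minimal sketch using the standard PWM relations for the Gumbel distribution, not the authors' code. The function names are hypothetical, and the regression of the cumulative mean on year is not included.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_pwm(sample):
    """Estimate Gumbel (location, scale) by probability weighted moments."""
    x = sorted(sample)
    n = len(x)
    b0 = sum(x) / n
    # Unbiased estimator of the first PWM: weights (i - 1)/(n - 1) on the
    # ascending order statistics (i is 0-based here).
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    l2 = 2 * b1 - b0                   # second L-moment
    scale = l2 / math.log(2)
    loc = b0 - EULER_GAMMA * scale
    return loc, scale

def return_level(loc, scale, return_period):
    """Quantile with non-exceedance probability 1 - 1/return_period."""
    p = 1 - 1 / return_period
    return loc - scale * math.log(-math.log(p))
```

Applying `gumbel_pwm` to each cumulative record (first 20 years, then 21, and so on) would give the sequence of location and scale parameters the abstract describes.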

Development of Multiple Linear Regression Model to Predict Agricultural Reservoir Storage based on Naive Bayes Classification and Weather Forecast Data (나이브 베이즈 분류와 기상예보자료 기반의 농업용 저수지 저수율 전망을 위한 저수율 예측 다중선형 회귀모형 개발)

  • Kim, Jin Uk;Jung, Chung Gil;Lee, Ji Wan;Kim, Seong Joon
    • Proceedings of the Korea Water Resources Association Conference / 2018.05a / pp.112-112 / 2018
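  • Localized and regional droughts caused by abnormal climate have recently become more frequent, and their severity and duration, not only their frequency, have increased greatly compared with the past, so the resulting damage is expected to grow. In particular, the unprecedented drought of 2014-2015 restricted reservoir water supply and damaged many farms. The purpose of this study is to develop a multiple linear regression model of agricultural reservoir storage rate that can use the 3-month forecast data of the Korea Meteorological Administration (KMA), and to produce storage rate outlook information for agricultural reservoirs nationwide. To develop a nationally applicable model, the study used five meteorological variables (precipitation, maximum temperature, minimum temperature, mean temperature, and mean wind speed) and observed reservoir storage rates. Meteorological observations from 2002 to 2017 were collected from 63 KMA ground stations. The storage rate outlook was divided into three steps. In the first step, multiple linear regression was performed using meteorological data and observed storage rates for 65 representative reservoirs, which the Korea Rural Community Corporation had identified through cluster analysis and decision tree analysis of 511 water-use districts nationwide. Monthly regression equations estimated with the collected meteorological variables and storage rate as independent variables gave coefficients of determination ($R^2$) of 0.51 to 0.95. In the second step, to extend the regression results of the representative reservoirs to reservoirs nationwide, naive Bayes classification was applied to classify 3,098 reservoirs nationwide into 65 clusters, and the monthly regression equations corresponding to each cluster were estimated. Finally, to forecast agricultural drought, 3-month forecast data from the KMA's GS5 (Global Seasonal Forecasting System 5) were collected and applied to the regression equations estimated for reservoirs nationwide, producing 3-month storage rate outlook information for 2017. The outlook technique based on the nationwide reservoir clustering showed a significant correlation with the storage rates observed in 2017, and the results are expected to be usable as outlook data for agricultural reservoir water supply and agricultural drought.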

A Study on Patterning and Grading by the Impact of Traffic Culture Index (교통문화지수 영향요인에 의한 유형화와 영향정도에 관한 연구)

  • Jeong Cheal-Woo;Jung Hun-Young;Ko Sang-Sean
    • Journal of Navigation and Port Research / v.30 no.1 s.107 / pp.35-43 / 2006
  • This study suggests strategies to prevent traffic accidents by using the impact factors of each cluster and the typical patterns of 81 cities, based on a statistical analysis of Traffic Culture Index (TCI) data. The TCI was developed through a partnership between the Traffic Safety Authority and the Green Traffic Movement Corporation in 2002 and 2003. Principal Component Analysis and Cluster Analysis of the impact factors and the TCI yield 4 components and 4 clusters. In addition, Stepwise Multiple Regression Analysis of the relationship between the impact factors and the TCI gives high $R^2$ values for the models of all clusters. Based on these results, we suggest concrete accident-prevention strategies for each cluster and note that future work should analyze how effective the invested facilities are in reducing traffic accidents.

Simultaneous Determination of Tryptophan and Tyrosine by Spectrofluorimetry Using Multivariate Calibration Method (다변량 분석법을 이용한 Tryptophan과 Tyrosine의 형광분광법적 정량)

  • Lee, Sang-Hak;Park, Ju-Eun;Son, Beom-Mok
    • Journal of the Korean Chemical Society / v.46 no.4 / pp.309-317 / 2002
  • A spectrofluorimetric method for the simultaneous determination of two amino acids (tryptophan and tyrosine) was studied, based on applying multivariate calibration methods such as principal component regression (PCR) and partial least squares (PLS) to luminescence measurements. Emission spectra of synthetic mixtures of the two amino acids were obtained at an excitation wavelength of 257 nm. The PCR and PLS calibration models were built from spectral data in the range 280-500 nm for a calibration set of 32 standards, each containing different amounts of the two amino acids. The relative standard error of prediction ($RSEP_a$) was used to assess how well the model quantifies each analyte in a validation set. The overall relative standard error of prediction ($RSEP_m$) for the mixture, obtained from a validation set of 6 independent mixtures, was also used to validate the method.

Optimization of Crude Protein Recovery from Papaya Latex Extract Using Response Surface Methodology (반응표면 분석법을 이용한 Papaya 유액추출물에서 Crude Protein 회수 조건의 최적화)

  • Oh, Hoon-Il;Oh, Sang-Joon;Kim, Jeong-Mee
    • Korean Journal of Food Science and Technology / v.29 no.4 / pp.752-757 / 1997
  • Crude papain extracted under optimum conditions was purified by ethanol precipitation. Four factors of the protein recovery method were optimized by response surface methodology (RSM), and the response function was expressed as a quadratic polynomial equation. The adequacy of the model equation for the optimum response values was tested; the optimum conditions for protein recovery were a protein concentration of 38.2 mg/mL, an ethanol concentration of 40%, and a precipitation temperature of $-8^{\circ}C$. Under these conditions, the experimental recovery yield (68.97%) was close to the predicted value (77.28%).

Modeling Methodology for Cold Tolerance Assessment of Pittosporum tobira (돈나무의 내한성 평가 모델링)

  • Kim, Inhea;Huh, Keun Young;Jung, Hyun Jong;Choi, Su Min;Park, Jae Hyoen
    • Horticultural Science & Technology / v.32 no.2 / pp.241-251 / 2014
  • This study was carried out to develop a simple, rapid, and reliable assessment model for predicting cold tolerance in Pittosporum tobira, a broad-leaved evergreen commonly used in the southern region of South Korea, that minimizes the experimental errors that can arise in an electrolyte leakage test for cold tolerance assessment. The modeling procedure comprised a regrowth test and an electrolyte leakage test on plants exposed to low-temperature treatments. The lethal temperatures estimated from methodological combinations of the electrolyte leakage test, covering tissue sampling, temperature treatment for potential electrical conductivity, and statistical analysis, were compared with the results of the regrowth test. In the regrowth test, the highest temperature at which the survival rate fell below 50% was $-10^{\circ}C$, and the lethal temperature was $-10^{\circ}C{\sim}-5^{\circ}C$. Based on the regrowth results, several methodological combinations of the electrolyte leakage test were evaluated; the lethal temperatures estimated using leaf tissue samples and the freeze-killing method were closest to the regrowth lethal temperature. Among the statistical analysis models, linear interpolation tended to overestimate cold tolerance more than non-linear regression did. Consequently, the optimal model for cold tolerance assessment of P. tobira consists of evaluating electrolyte leakage from leaf tissue samples, applying the freeze-killing method for potential electrical conductivity, and predicting the lethal temperature through non-linear regression analysis.

The Policy Effect of Minimum Housing Standards: Differences-in-Differences Estimation (최저주거기준 설정의 정책 효과: 이중차분법 추정)

  • Yi, Gunmin
    • 한국사회정책 / v.23 no.1 / pp.25-59 / 2016
  • This paper analyzes the policy effect of minimum housing standards, exploiting the fact that Seoul set minimum housing standards in 1998. Because the rest of the country did not set minimum housing standards in 1998, this situation can be treated as a quasi-experiment. To identify the policy effect, I compare the decrease in the number of households below the threshold between Seoul and comparison regions from 1995 to 2000, using the Differences-in-Differences method. I obtain a one-to-one comparison estimate, using Gyeonggi province as the comparison region, and an OLS estimate, using the whole nation except Seoul as the comparison region, and compare the two. The former and the latter suggest that the setting of the Seoul minimum housing standard in 1998 accounts for a decrease in the number of households under the minimum housing standard of about 216,638 and 325,149, respectively. The latter is statistically significant at the 0.001 level, and the former lies within the 95% confidence interval of the latter. We therefore conclude that setting minimum housing standards contributes significantly to achieving the policy objective, a decrease in the number of households below the threshold.
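
The one-to-one comparison described here reduces to a two-group, two-period difference of differences. The sketch below shows only that arithmetic; the function name is hypothetical, and the abstract does not report the underlying household counts, so no figures from the paper are reproduced.

```python
def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate.

    The change in the comparison region stands in for what would have
    happened in the treated region without the policy (the parallel-trends
    assumption), so it is subtracted from the treated region's change.
    """
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change
```

With made-up inputs, `did_estimate(100, 80, 100, 95)` gives `-15`: the treated group fell by 20, the comparison group by 5, and the remaining 15-unit decrease is attributed to the policy.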

Study on the Estimation of Duncan & Chang Model Parameters-initial Tangent Modulus and Ultimate Deviator Stress for Compacted Weathered Soil (다짐 풍화토의 Duncan & Chang 모델 매개변수-초기접선계수와 극한축차응력 산정에 관한 연구)

  • Yoo, Kunsun
    • Journal of the Korean GEO-environmental Society / v.19 no.12 / pp.47-58 / 2018
  • Duncan & Chang (1970) proposed the Duncan-Chang model, in which the nonlinear stress-strain curve of a triaxial compression test is recast, using hyperbolic theory, as a linear relation on a transformed stress-strain plot in order to estimate the initial tangent modulus and ultimate deviator stress of a soil specimen. Although the transformed stress-strain plot is linear in theory, it is actually nonlinear at both low and high strains of the test. This indicates that the stress-strain curve is not a perfect hyperbola. Consequently, if linear regression on the transformed stress-strain plot is performed over the full strain range of a test, error in the estimated linear equation is unavoidable, depending on the strain ranges that exhibit nonlinearity. To reduce this error, this study proposes a modified regression analysis method in which linear regression on the transformed stress-strain plot is performed over the entire strain range except the nonlinear ranges near the start and end of the test, and the initial tangent modulus and ultimate deviator stress are then calculated. Isotropically consolidated-drained triaxial compression tests were performed on weathered soil compacted to a modified Proctor density to obtain the model parameters. The modified regression analyses were performed on the transformed stress-strain plots, and the results were compared with those estimated by the 2-point method (Duncan et al., 1980). The initial tangent moduli were about 4.0% higher and the ultimate deviator stresses about 2.9% lower than the values estimated by Duncan's 2-point method.
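
The transformed-plot regression can be sketched as follows, assuming the usual hyperbolic form in which strain divided by deviator stress is linear in strain, with intercept equal to the reciprocal of the initial tangent modulus and slope equal to the reciprocal of the ultimate deviator stress. The strain window `lo` to `hi` is a hypothetical stand-in for the range the paper keeps after excluding the nonlinear start and end; the abstract does not say how those limits are chosen.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def duncan_chang_parameters(strain, deviator_stress, lo, hi):
    """Estimate (initial tangent modulus, ultimate deviator stress).

    Regresses strain / deviator_stress on strain using only the points
    with lo <= strain <= hi (hypothetical trimming limits).
    """
    kept = [(e, e / q) for e, q in zip(strain, deviator_stress)
            if lo <= e <= hi and q > 0]
    intercept, slope = fit_line([p[0] for p in kept], [p[1] for p in kept])
    return 1 / intercept, 1 / slope
```

Setting `lo` and `hi` to the minimum and maximum strain gives the full-range regression, so the effect of trimming on the two parameters can be compared directly.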