• Title/Abstract/Keywords: coefficient of determination (R-square)

Search results: 165 items (processing time 0.027 s)

Note on Use of $R^2$ for No-intercept Model

  • Do, Jong-Doo;Kim, Tae-Yoon
    • Journal of the Korean Data and Information Science Society / Vol. 17, No. 2 / pp.661-668 / 2006
  • The use of the coefficient of determination for the linear no-intercept model has been controversial. One definition of the coefficient of determination, $R^2 = \sum \hat{y}^2 / \sum y^2$, is widely accepted only for linear no-intercept models, although Kvalseth (1985) demonstrated some possible pitfalls in using this $R^2$. The main objective of this note is to report that $R^2$ is not a desirable measure of fit for the no-intercept linear model. In fact, it is found that the mean square error (MSE) can replace $R^2$ efficiently in most cases where selection of a no-intercept model is at issue.
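As a rough illustration of the pitfall this abstract describes, the sketch below fits a no-intercept line to made-up data and compares the uncentered $R^2 = \sum \hat{y}^2 / \sum y^2$ with the usual mean-centered $R^2$ and with the MSE. The data and function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_no_intercept(x, y):
    """Least-squares slope b for the no-intercept model y = b * x."""
    return float(np.dot(x, y) / np.dot(x, x))

def r2_uncentered(y, y_hat):
    """No-intercept definition: sum(y_hat^2) / sum(y^2)."""
    return float(np.sum(y_hat ** 2) / np.sum(y ** 2))

def r2_centered(y, y_hat):
    """Usual definition: 1 - SSE / SST, with SST taken about the mean of y."""
    sse = np.sum((y - y_hat) ** 2)
    sst = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - sse / sst)

def mse(y, y_hat, n_params=1):
    """Residual mean square with n_params fitted coefficients."""
    return float(np.sum((y - y_hat) ** 2) / (len(y) - n_params))

# Hypothetical data: y stays near 10 and barely depends on x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.0, 9.5, 10.5, 9.8, 10.2])

b = fit_no_intercept(x, y)
y_hat = b * x

print("slope:", b)
print("uncentered R^2:", r2_uncentered(y, y_hat))
print("centered R^2:", r2_centered(y, y_hat))
print("MSE:", mse(y, y_hat))
```

On this data the uncentered $R^2$ exceeds 0.8 even though the fitted line tracks the observations poorly, while the centered value is strongly negative and the MSE is large relative to the spread of y.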


Comparison of models for estimating surplus productions and methods for estimating their parameters

  • 권유정;장창익;표희동;서영일
    • 수산해양기술연구 / Vol. 49, No. 1 / pp.18-28 / 2013
  • We compared parameters estimated by three kinds of surplus production models: three traditional surplus production models (Schaefer, Gulland, and Schnute), a stock production model incorporating covariates (ASPIC), and a maximum entropy (ME) model. We also evaluated the performance of the models in estimating their parameters. The maximum sustainable yield (MSY) of small yellow croaker (Pseudosciaena polyactis) in Korean waters ranged from 35,061 metric tons (mt) by the Gulland model to 44,844 mt by the ME model, and fishing effort at MSY ($f_{MSY}$) ranged from 262,188 hauls by the Schnute model to 355,200 hauls by the ME model. The lowest root mean square error (RMSE) for small yellow croaker was obtained from the Gulland surplus production model, while the highest RMSE was from the Schnute model. However, the highest coefficient of determination ($R^2$) was from the ME model, while the ASPIC model yielded the lowest. On the other hand, the MSY of Kapenta (Limnothrissa miodon) ranged from 16,880 mt by the ASPIC model to 25,373 mt by the ME model, and $f_{MSY}$ ranged from 94,580 hauls by the ASPIC model to 225,490 hauls by the Schnute model. In this case, both the lowest RMSE and the highest $R^2$ were obtained from the ME model, which fit the data relatively better, indicating that the ME model is statistically more stable and robust than the other models. Moreover, the ME model could provide additional ecologically useful parameters such as biomass at MSY ($B_{MSY}$), carrying capacity of the population (K), catchability coefficient (q), and the intrinsic rate of population growth (r).
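For readers unfamiliar with how MSY, $f_{MSY}$, and $B_{MSY}$ relate to r, K, and q, the following minimal sketch uses the standard logistic (Schaefer) surplus production form, in which MSY = rK/4, $B_{MSY}$ = K/2, and $f_{MSY}$ = r/(2q). The parameter values are invented for illustration; they are not estimates from the paper, and the paper's ASPIC and ME models are not reproduced here.

```python
from dataclasses import dataclass

@dataclass
class SchaeferParams:
    r: float  # intrinsic rate of population growth
    K: float  # carrying capacity (biomass units)
    q: float  # catchability coefficient (per unit effort)

def reference_points(p):
    """Return (MSY, B_MSY, f_MSY) for the logistic Schaefer model."""
    msy = p.r * p.K / 4.0
    b_msy = p.K / 2.0
    f_msy = p.r / (2.0 * p.q)
    return msy, b_msy, f_msy

def equilibrium_yield(p, f):
    """Equilibrium catch at constant effort f: Y = q*K*f*(1 - q*f/r)."""
    return p.q * p.K * f * (1.0 - p.q * f / p.r)

# Hypothetical parameter values, chosen only to make the arithmetic easy.
params = SchaeferParams(r=0.4, K=400_000.0, q=1e-6)
msy, b_msy, f_msy = reference_points(params)

print("MSY:", msy)
print("B_MSY:", b_msy)
print("f_MSY:", f_msy)
print("yield at f_MSY:", equilibrium_yield(params, f_msy))
```

The equilibrium yield evaluated at $f_{MSY}$ equals MSY, which is a quick consistency check on any set of fitted parameters.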

Nondestructive Prediction of Fatty Acid Composition in Sesame Seeds by Near Infrared Reflectance Spectroscopy

  • Kim, Kwan-Su;Park, Si-Hyung;Choung, Myoung-Gun;Kim, Sun-Lim
    • 한국작물학회지 / Vol. 51, No. spc1 / pp.304-309 / 2006
  • Near infrared reflectance spectroscopy (NIRS) was used to develop a rapid and nondestructive method for determining the fatty acid composition of sesame (Sesamum indicum L.) seed oil. A total of ninety-three samples of intact seeds were scanned in the reflectance mode of a scanning monochromator, and reference values for fatty acid composition were measured by gas-liquid chromatography. Calibration equations were developed using modified partial least squares regression with internal cross validation (n=63). The equations obtained had low standard errors of cross-validation and moderate $R^2$ (coefficient of determination in calibration). Prediction of an external validation set (n=30) showed significant correlation between reference values and NIRS-estimated values based on the SEP (standard error of prediction), $r^2$ (coefficient of determination in prediction), and the ratio of the standard deviation (SD) of the reference data to the SEP. The models developed in this study had relatively high values (more than 2.0) of SD/SEP(C) for oleic and linoleic acid, indicating good correlation between reference and NIRS-estimated values. The results indicated that NIRS, a nondestructive screening method, could be used to rapidly determine fatty acid composition in sesame seeds in breeding programs for high-quality sesame oil.
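The validation statistics named in this abstract can be sketched as follows. The sketch assumes that SEP(C) denotes the bias-corrected standard error of prediction, which is a common convention but is not spelled out in the abstract. The reference and predicted values are invented.

```python
import numpy as np

def validation_stats(reference, predicted):
    """Bias, SEP(C), r^2 and SD/SEP(C) for an external validation set."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    resid = predicted - reference
    bias = resid.mean()
    n = len(reference)
    # Assumed definition: bias-corrected standard error of prediction.
    sep_c = np.sqrt(np.sum((resid - bias) ** 2) / (n - 1))
    # Squared Pearson correlation between reference and predicted values.
    r = np.corrcoef(reference, predicted)[0, 1]
    # Sample standard deviation of the reference data.
    sd = reference.std(ddof=1)
    return {
        "bias": float(bias),
        "sep_c": float(sep_c),
        "r2": float(r ** 2),
        "sd_over_sep": float(sd / sep_c),
    }

# Hypothetical oleic acid contents (%), reference vs. NIRS estimate.
reference = [40.0, 42.0, 44.0, 46.0, 48.0]
predicted = [40.5, 41.5, 44.5, 45.5, 48.5]

stats = validation_stats(reference, predicted)
for name, value in stats.items():
    print(name, round(value, 3))
```

A ratio SD/SEP(C) above 2.0, the threshold quoted in the abstract, means the prediction error is less than half the natural spread of the reference values.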

The analysis of oat chemical properties using visible-near infrared spectroscopy

  • Jang, Hyeon Jun;Choi, Chang Hyun;Choi, Tae Hyun;Kim, Jong Hun;Kwon, Gi Hyeon;Oh, Seung Il;Kim, Hoon;Kim, Yong Joo
    • 농업과학연구 / Vol. 43, No. 5 / pp.715-722 / 2016
  • Rapid determination of food quality is important in food distribution. In this study, the chemical properties of oats were analyzed using visible-near infrared (VIS-NIR) spectroscopy. The objective of this study was to develop and validate a predictive model of oat quality by VIS-NIR spectroscopy. A total of 200 oat samples were collected from domestic and import markets. Reflectance spectra and the moisture, protein, fat, Fe, and K contents of the oat samples were measured. Reflectance spectra were measured in the wavelength range of 400 - 2,500 nm at 2 nm intervals. The reflectance spectrum of an oat sample was measured after the sample cell and reflectance plate spectra had been measured. Preprocessing methods such as normalization and $1^{st}$ and $2^{nd}$ derivatives were used to minimize spectroscopic noise. Partial least squares (PLS) models were developed to predict the chemical properties of oats using a commercial software package, Unscrambler. The PLS models showed that the moisture, protein, and fat content of oat samples could be predicted. The coefficient of determination ($R^2$) for moisture, protein, and fat was greater than 0.89. However, it was hard to predict Fe and K concentrations due to their low concentrations in the oat samples. The coefficients of determination for Fe and K were 0.57 and 0.77, respectively. In future studies, the stability and practicability of these models should be improved by using a high-accuracy spectrophotometer and by performing calibrations with a wider range of oat chemicals.

Development of Prediction Model of Chloride Diffusion Coefficient using Machine Learning

  • 김현수
    • 한국공간구조학회논문집 / Vol. 23, No. 3 / pp.87-94 / 2023
  • Chloride is one of the most common threats to reinforced concrete (RC) durability. The alkaline environment of concrete forms a passive layer on the surface of reinforcement bars that protects them from corrosion. However, when the chloride concentration at the reinforcement bar reaches a certain level, the passive protection layer deteriorates, causing corrosion and ultimately reducing the structure's safety and durability. Therefore, understanding and predicting chloride diffusion are important for evaluating the safety and durability of RC structures. In this study, the chloride diffusion coefficient is predicted by machine learning techniques. Various machine learning techniques such as multiple linear regression, decision tree, random forest, support vector machine, artificial neural networks, extreme gradient boosting, and k-nearest neighbor were used, and the accuracy of these models was compared. To evaluate the accuracy, root mean square error (RMSE), mean square error (MSE), mean absolute error (MAE), and coefficient of determination ($R^2$) were used as prediction performance indices. The k-fold cross-validation procedure was used to estimate the performance of the machine learning models when making predictions on data not used during training. Grid search was applied to hyperparameter optimization. Numerical simulation showed that ensemble learning methods such as random forest and extreme gradient boosting successfully predicted the chloride diffusion coefficient, and artificial neural networks also provided accurate results.
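The evaluation procedure described in this abstract (k-fold cross-validation scored by MSE, RMSE, MAE, and $R^2$) can be sketched with NumPy alone. For brevity the sketch uses only ordinary multiple linear regression, one of the techniques listed, and synthetic data. The paper's dataset, tuned models, and grid search are not reproduced, and all names below are hypothetical.

```python
import numpy as np

def metrics(y_true, y_pred):
    """Prediction performance indices on one held-out fold."""
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {"MSE": mse, "RMSE": mse ** 0.5, "MAE": mae, "R2": 1.0 - ss_res / ss_tot}

def kfold_linear(X, y, k=5, seed=0):
    """k-fold cross-validation of a least-squares linear model."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Fit on the training folds only (intercept column added).
        A = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        # Score on the fold that was held out.
        pred = np.column_stack([np.ones(len(test)), X[test]]) @ coef
        scores.append(metrics(y[test], pred))
    return scores

# Synthetic stand-in for mix-design features and a diffusion coefficient.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(100, 3))
y = 2.0 + X @ np.array([1.5, -0.7, 0.3]) + rng.normal(0.0, 0.05, size=100)

scores = kfold_linear(X, y, k=5)
mean_r2 = float(np.mean([s["R2"] for s in scores]))
print("mean R^2 over folds:", mean_r2)
```

Averaging each index over the folds gives an estimate of performance on data not used during training, which is how the abstract says the models were compared.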

Fuzzy logic approach for estimating bond behavior of lightweight concrete

  • Arslan, Mehmet E.;Durmus, Ahmet
    • Computers and Concrete / Vol. 14, No. 3 / pp.233-245 / 2014
  • In this paper, a rule-based Mamdani-type fuzzy logic (FL) model for predicting slippage at maximum tensile strength and slippage at rupture of structural lightweight concretes is discussed. Steel rebar diameters and development lengths were used as model inputs. The FL model predictions were compared with experimental results, using the coefficient of determination ($R^2$) and the root mean square error as evaluation criteria. It was concluded that FL is a practical method for predicting slippage at maximum tensile strength and slippage at rupture of structural lightweight concretes.

Development of On-line Sorting System for Detection of Infected Seed Potatoes Using Visible Near-Infrared Transmittance Spectral Technique

  • 김대용;모창연;강점순;조병관
    • 비파괴검사학회지 / Vol. 35, No. 1 / pp.1-11 / 2015
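  • In this study, an online nondestructive sorting system for infected seed potatoes was built, and its performance was evaluated by developing and applying a statistical model for detecting infected seed potatoes. To develop the sorting model, seed potatoes were artificially inoculated with Pectobacterium atrosepticum, a representative pathogenic bacterium responsible for soil-borne disease and latent infection, so that disease symptoms developed inside the tubers. Transmittance spectra of infected and healthy seed potatoes were acquired with the sorting system, and a detection model for infected seed potatoes was developed using partial least squares discriminant analysis (PLS-DA). The developed model had a validation coefficient of determination ($R^2$) of 0.943 and a classification accuracy of over 99% (n=80), showing excellent sorting performance. The developed online sorting system is expected to be applicable not only to seed potato sorting but also as a base technology for detecting infection in a variety of agricultural products.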

A Comparative Study on the Infinite NHPP Software Reliability Model Following Chi-Square Distribution with Lifetime Distribution Dependent on Degrees of Freedom

  • 김희철;김재욱
    • 한국정보전자통신기술학회논문지 / Vol. 10, No. 5 / pp.372-379 / 2017
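  • Software reliability is a fundamental concern throughout the software development process. When an infinite-failure nonhomogeneous Poisson process (NHPP) is used to characterize software failures, the failure occurrence rate, or hazard function, may be constant, increasing, or decreasing. This paper proposes a reliability model based on a chi-square distribution that depends on its degrees of freedom, in order to compare efficiency in terms of software reliability performance. To assess model efficiency, the mean square error (MSE) and the coefficient of determination ($R^2$) were used, and parameters were estimated by maximum likelihood estimation with numerical methods. For the proposed model, failure performance was analyzed using actual failure time-interval data. The failure data analysis was compared on the basis of the intensity function determined by the degrees of freedom of the chi-square distribution. The Laplace trend test was applied to check the reliability of the data. Because the proposed chi-square model with varying degrees of freedom can represent a variety of failure phenomena (coefficient of determination of 90% or higher), it can be used as a model in the reliability field. The results of this study can serve as a standard guideline for software developers and designers in building efficient models by applying various degrees of freedom to predict software failure patterns.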

Prediction of ham weight with the Autofom in Korea

  • 배진규;이영규;박범영;임효선;정봉수
    • 한국동물위생학회지 / Vol. 39, No. 1 / pp.7-12 / 2016
  • The Autofom is equipment that predicts the amount of meat in pig carcasses in real time using 16 ultrasonic sensors, and it was first installed in Korea at Dodram LPC in Gyeonggi Province. This study was carried out to statistically validate the reliability of the Autofom and to establish a guideline for developing an analytic formula by comparing measurements from the Autofom with those from dissection. The ham parts of sixty-six pig carcasses were measured with the Autofom and by two experimenters. The mean weights and standard deviations of ham parts including bone measured with the Autofom and by dissection were $10.69 \pm 0.81$ kg and $10.77 \pm 0.94$ kg, respectively; a strong positive correlation (P<0.01) was identified, with a coefficient of determination ($R^2$) of 0.82. The mean weights and standard deviations of lean ham parts measured with the Autofom and by dissection were $7.41 \pm 0.58$ kg and $7.42 \pm 0.89$ kg, respectively; a strong positive correlation (P<0.01) was identified, with a coefficient of determination ($R^2$) of 0.72. The root mean square errors of the two groups were 0.40 and 0.50, respectively.

A Study on the Safe Blasting Design by Statistical Analysis of Ground Vibration for Vibration Controlled Blasting in Urban Area (II)

  • 김영환;안명석;박종남;강대우;이창우
    • 화약ㆍ발파 / Vol. 18, No. 2 / pp.7-13 / 2000
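  • The study area consists of andesite. Rock mass characteristics were expressed by a crack coefficient, which reflects the structural properties of the ground well, and errors in estimating the blast vibration equation were minimized by raising the coefficient of determination. When all measured data were analyzed cumulatively, the coefficient of determination was 0.002 to 0.531, which is unreliable. Regression on the group-mean vibration velocities of same-distance groups with the same charge weight gave a coefficient of determination of 0.493 to 0.531, which is not very high, and the coefficient of determination obtained with trimmed means was 0.307 to 0.487, again unreliable. Weighting by the number of samples gave a coefficient of determination of 0.644 to 0.752, the highest among the statistical methods applied in this study. Weighting by the standard deviation of vibration velocity gave 0.516 to 0.668, and weighting by the variance of vibration velocity gave 0.516 to 0.685. Therefore, when deriving the blast vibration estimation equation, the regression obtained by applying weights to the mean vibration velocities of same-distance groups within 15 m with the same charge weight was the most reliable. In this case, the blast vibration constants were $K_{95}$ = 317.4 and n = -1.66 for square-root scaling, and $K_{95}$ = 209.9 and n = -1.60 for cube-root scaling. In the crossover analysis of the square-root and cube-root equations, the crossover point at an allowable vibration velocity of 4 cm/sec was 31 m. Cube-root scaling was therefore judged more reliable within 31 m of the blast point, and square-root scaling more reliable beyond 31 m.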
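The sample-count weighting described in this abstract can be illustrated with a weighted least-squares fit in log-log space. The sketch assumes the usual scaled-distance form $V = K \cdot SD^{n}$, where SD is distance divided by the square root or cube root of the charge weight. It fits only the mean line, not the 95% constant $K_{95}$ reported in the paper, and the group means and sample counts are invented.

```python
import numpy as np

def weighted_loglog_fit(scaled_distance, velocity, weights):
    """Weighted least-squares fit of log10(V) = log10(K) + n * log10(SD).

    Returns (K, n, weighted R^2). Each group mean is weighted by `weights`,
    for example the number of samples in the group.
    """
    x = np.log10(np.asarray(scaled_distance, dtype=float))
    y = np.log10(np.asarray(velocity, dtype=float))
    w = np.asarray(weights, dtype=float)
    # Scaling rows by sqrt(w) makes lstsq minimize sum(w * residual^2).
    sw = np.sqrt(w)
    A = np.column_stack([np.ones_like(x), x]) * sw[:, None]
    coef, *_ = np.linalg.lstsq(A, y * sw, rcond=None)
    log_k, n = coef
    y_hat = log_k + n * x
    y_bar = np.average(y, weights=w)
    r2 = 1.0 - np.sum(w * (y - y_hat) ** 2) / np.sum(w * (y - y_bar) ** 2)
    return float(10.0 ** log_k), float(n), float(r2)

# Hypothetical group means: scaled distance, mean velocity, sample count.
sd = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
scatter = np.array([1.10, 0.90, 1.05, 0.95, 1.00])  # made-up scatter factors
v_mean = 200.0 * sd ** -1.6 * scatter
counts = np.array([3, 12, 20, 9, 4])

K, n, r2 = weighted_loglog_fit(sd, v_mean, counts)
print("K:", K, "n:", n, "weighted R^2:", r2)
```

Passing the group standard deviations or variances, suitably inverted, as `weights` would give the other two weighting schemes the abstract compares. How the paper actually constructed those weights is not stated here, so that step is left as an assumption.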
