• Title/Summary/Keyword: Linear regression fit

Bayesian curve-fitting with radial basis functions under functional measurement error model

  • Hwang, Jinseub;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society / v.26 no.3 / pp.749-754 / 2015
  • This article presents a Bayesian approach to regression splines with knots on a grid of equally spaced sample quantiles of the independent variables under a functional measurement error model. We consider a small area model that uses penalized splines to capture non-linear patterns. Specifically, we use radial basis functions as the basis functions of the regression spline. To fit the model and estimate its parameters, we suggest a hierarchical Bayesian framework using Markov chain Monte Carlo methodology. Furthermore, we illustrate the method with a data application. We check convergence with the potential scale reduction factor, and we use the posterior predictive p-value and the mean logarithmic conditional predictive ordinate to compare models.
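
As a rough illustration of the basis construction described in this abstract, the sketch below places knots at equally spaced sample quantiles and builds a radial basis design matrix. The cubic radial form |x - k|^3, the function names, and the simulated covariate are assumptions made for illustration; the abstract does not specify them, and the hierarchical Bayesian fitting step is not shown.

```python
import numpy as np

def quantile_knots(x, num_knots):
    """Knots at equally spaced sample quantiles of x (endpoints excluded)."""
    probs = np.arange(1, num_knots + 1) / (num_knots + 1)
    return np.quantile(x, probs)

def radial_basis_design(x, knots):
    """Design matrix [1, x, |x - k_1|^3, ..., |x - k_K|^3].

    The cubic radial basis is an assumed choice, not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    radial = np.abs(x[:, None] - knots[None, :]) ** 3
    return np.column_stack([np.ones_like(x), x, radial])

# Hypothetical covariate values, used only to exercise the functions.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
knots = quantile_knots(x, num_knots=5)
Z = radial_basis_design(x, knots)
print(Z.shape)
```

In a penalized-spline setting, the coefficients of the radial columns would typically be shrunk (or given a prior) while the intercept and linear term are left unpenalized.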

A note on standardization in penalized regressions

  • Lee, Sangin
    • Journal of the Korean Data and Information Science Society / v.26 no.2 / pp.505-516 / 2015
  • We consider sparse high-dimensional linear regression models. Penalized regressions have been used as effective methods for variable selection and estimation in high-dimensional models. In penalized regressions, it is common practice to standardize the variables and then fit the penalized model to the standardized variables. Finally, the estimated coefficients from the penalized model are transformed back to the scale of the original variables. However, this procedure produces a slightly different solution from that of the corresponding original penalized problem. In this paper, we investigate issues in the standardization of variables in penalized regressions and formulate the definition of the standardized penalized estimator. In addition, we compare the original penalized estimator with the standardized penalized estimator through simulation studies and real data analysis.
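
The difference between the two procedures can be seen in a small numerical sketch. Ridge regression is used here as a stand-in penalized estimator because it has a closed form; the abstract concerns sparse penalized regressions in general, and the simulated data, the penalty value, and the function names below are hypothetical.

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge coefficients with an unpenalized intercept (closed form)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    return y_mean - x_mean @ beta, beta

def ridge_standardized(X, y, lam):
    """Scale columns to unit standard deviation, fit, then map back."""
    scale = X.std(axis=0)
    _, beta_std = ridge(X / scale, y, lam)
    beta = beta_std / scale  # back to the original variable scale
    return y.mean() - X.mean(axis=0) @ beta, beta

# Hypothetical data whose columns live on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * np.array([1.0, 10.0, 0.1])
y = X @ np.array([1.0, 0.2, 5.0]) + rng.normal(scale=0.5, size=100)

_, beta_original = ridge(X, y, lam=10.0)
_, beta_standardized = ridge_standardized(X, y, lam=10.0)
print(beta_original)
print(beta_standardized)
```

With the penalty set to zero both routes reduce to ordinary least squares and agree; with a positive penalty the back-transformed coefficients differ because the penalty acts on differently scaled coefficients.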

A Study on the Determination of Point Probability Rainfall-Depth in Korea by the Linear Least Squares Method (Seoul, Daegu and Mokpo) (회귀선에 의한 국내 지점 확률항우량산정에 관한 연구 (서울, 대구, 목포 지점을 중심으로))

  • 이원환;김재한
    • Water for future / v.9 no.1 / pp.81-85 / 1976
  • This study aims to determine the probability rainfall depth at Seoul, Daegu, and Mokpo easily by using a regression line. The relationship between the probability rainfall depth for each duration from 10 to 120 minutes and the return period is derived as a linear least squares fit, and an analytical method for obtaining the probability rainfall depth for a given duration directly from that fit is studied. A fair correlation among these quantities is found, and, when the variables are suitably transformed, the method is considered applicable to locations other than the three cities.

Estimation of Leak Frequency Function by Application of Non-linear Regression Analysis to Generic Data (비선형 회귀분석을 이용한 Generic 데이터 기반의 누출빈도함수 추정)

  • Yoon, Ik Keun;Dan, Seung Kyu;Jung, Ho Jin;Hong, Seong Kyeong
    • Journal of the Korean Society of Safety / v.35 no.5 / pp.15-21 / 2020
  • Quantitative risk assessment (QRA) is used as a legal or voluntary safety management tool in the hazardous materials industry, and its use is gradually increasing. A leak frequency analysis based on reliable generic data is therefore a critical element in the evolution of QRA and safety technologies. The aim of this paper is to derive a leak frequency function that can be applied more flexibly in QRA, based on the OGP report, which is highly reliable and used worldwide. For this purpose, we first reviewed the data on the 16 types of equipment included in the OGP report and selected the predictors. We then found equations that fit the OGP data well using non-linear regression analysis. Various expectation functions were applied to search for suitable parameters that can serve as a meaningful reference in the future. The results show that the best-fitting parameters are found with the DNV function form and the connection function in natural logarithm. The average percentage error between the fitted and original values is as small as 3%, so the derived prediction function is applicable to quantitative frequency analysis. This study contributes to expanding the applicability of QRA and to advancing safety engineering by providing generic equations for practical leak frequency analysis.

A Climate Prediction Method Based on EMD and Ensemble Prediction Technique

  • Bi, Shuoben;Bi, Shengjie;Chen, Xuan;Ji, Han;Lu, Ying
    • Asia-Pacific Journal of Atmospheric Sciences / v.54 no.4 / pp.611-622 / 2018
  • Observed climate data are processed under the assumption that their time series are stationary, as in multi-step temperature and precipitation prediction, which usually leads to low prediction accuracy. If a climate system model is based on a single prediction model, the prediction results contain significant uncertainty. In order to overcome this drawback, this study uses a method that integrates ensemble prediction and a stepwise regression model based on a mean-valued generation function. In addition, it utilizes empirical mode decomposition (EMD), which is a new method of handling time series. First, a non-stationary time series is decomposed into a series of intrinsic mode functions (IMFs), which are stationary and multi-scale. Then, a different prediction model is constructed for each component of the IMF using numerical ensemble prediction combined with stepwise regression analysis. Finally, the results are fit to a linear regression model, and a short-term climate prediction system is established using the Visual Studio development platform. The model is validated using temperature data from February 1957 to 2005 from 88 weather stations in Guangxi, China. The results show that compared to single-model prediction methods, the EMD and ensemble prediction model is more effective for forecasting climate change and abrupt climate shifts when using historical data for multi-step prediction.

Comparison study on kernel type estimators of discontinuous log-variance (불연속 로그분산함수의 커널추정량들의 비교 연구)

  • Huh, Jib
    • Journal of the Korean Data and Information Science Society / v.25 no.1 / pp.87-95 / 2014
  • In the regression model, Kang and Huh (2006) studied the estimation of the discontinuous variance function using the Nadaraya-Watson estimator with the squared residuals. The local linear estimator of the log-variance function, which may take any real value, was proposed by Huh (2013) based on the kernel-weighted local likelihood of the $\chi^2$-distribution. Chen et al. (2009) estimated the continuous variance function using the local linear fit with the log-squared residuals. In this paper, we consider estimating the discontinuous log-variance function itself or its derivative using the estimator of Chen et al. (2009). Numerical studies investigate the performance of the estimators with simulated examples.
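
The residual-based idea mentioned in this abstract (smoothing squared residuals with a Nadaraya-Watson estimator) can be sketched as follows. This is a plain two-sided Gaussian-kernel smoother, so it blurs the jump rather than estimating the discontinuity as the cited papers do; the kernel, bandwidths, and simulated data are all hypothetical.

```python
import numpy as np

def nadaraya_watson(x_eval, x, y, bandwidth):
    """Gaussian-kernel Nadaraya-Watson estimate of E[y | x] at x_eval."""
    x_eval = np.asarray(x_eval, dtype=float)
    u = (x_eval[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)
    return (w @ y) / w.sum(axis=1)

# Hypothetical data: smooth mean, variance jumping from 0.25 to 2.25 at x = 0.5.
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 1.0, size=400))
sigma = np.where(x < 0.5, 0.5, 1.5)
y = np.sin(2.0 * np.pi * x) + sigma * rng.normal(size=x.size)

# Step 1: estimate the mean function and form squared residuals.
mean_hat = nadaraya_watson(x, x, y, bandwidth=0.05)
sq_resid = (y - mean_hat) ** 2

# Step 2: smooth the squared residuals to estimate the variance function.
grid = np.array([0.25, 0.75])
var_hat = nadaraya_watson(grid, x, sq_resid, bandwidth=0.1)
print(var_hat)
```

Replacing `sq_resid` with its logarithm in step 2 would give a log-variance estimate in the spirit of the log-squared-residual approach, although the cited work uses a local linear fit rather than this local constant smoother.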

Validity for Use of Non-HDL Cholesterol Rather than LDL Cholesterol

  • Kwon, Se-Young;Na, Young-Ak
    • Korean Journal of Clinical Laboratory Science / v.45 no.2 / pp.54-59 / 2013
  • Non-HDL cholesterol values have been suggested as a risk marker for cardiovascular disease. Non-HDL cholesterol is calculated with a very simple formula: non-HDL cholesterol = serum total cholesterol - HDL cholesterol. This formula is very useful as a screening tool for identifying dyslipoproteinemias, for risk assessment, and for assessing the results of hypolipidemic therapy. The data from the 2009 Korean National Health and Nutrition Examination Survey were used. Analysis was done for 1,992 subjects with lipid panel results (cholesterol, HDL, direct LDL, and triglycerides). We studied the relationship between non-HDL cholesterol and LDL cholesterol by plotting non-HDL cholesterol values against the direct and calculated LDL values. The linear regression equation for non-HDL cholesterol and direct LDL cholesterol was $\mathrm{nonHDLchol} = 23.60 + 1.03 \times \mathrm{LDLdirect}$ (p<0.0001, $r^2=0.80$) in all subjects. The subjects were then classified by triglyceride value. When triglycerides are below 400 mg/dL, the linear fit to direct LDL is $\mathrm{nonHDLchol} = 17.34 + 1.07 \times \mathrm{LDLdirect}$ (p<0.0001, $r^2=0.88$) and the fit to the Friedewald LDL calculation is $\mathrm{nonHDLchol} = 23.10 + 1.02 \times \mathrm{LDLcalc}$ (p<0.0001, $r^2=0.82$). For triglycerides above 400 mg/dL, the linear fit to direct LDL is $\mathrm{nonHDLchol} = 87.57 + 0.92 \times \mathrm{LDLdirect}$ (p<0.0001, $r^2=0.50$) and the fit to calculated LDL is $\mathrm{nonHDLchol} = 142.70 + 0.50 \times \mathrm{LDLcalc}$ (p<0.0001, $r^2=0.32$). This study provides examples of the utility of non-HDL cholesterol concentrations in clinical medicine.
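
The formula and the reported direct-LDL fits from this abstract can be written out as a short sketch. The coefficients are those quoted in the abstract; the function names, the assumption that all values are in mg/dL, and the handling of a triglyceride value of exactly 400 (the abstract only says "below" and "above") are illustrative choices, and the example inputs are made up.

```python
def non_hdl(total_cholesterol, hdl):
    """Non-HDL cholesterol = serum total cholesterol - HDL cholesterol."""
    return total_cholesterol - hdl

def non_hdl_from_ldl_direct(ldl_direct, triglycerides):
    """Non-HDL cholesterol predicted from direct LDL with the reported fits.

    Values are assumed to be in mg/dL. Treating exactly 400 mg/dL as the
    upper group is an assumption; the abstract does not cover that case.
    """
    if triglycerides < 400:
        return 17.34 + 1.07 * ldl_direct
    return 87.57 + 0.92 * ldl_direct

# Hypothetical inputs, for illustration only.
measured = non_hdl(total_cholesterol=200.0, hdl=50.0)
predicted_low_tg = non_hdl_from_ldl_direct(ldl_direct=100.0, triglycerides=150.0)
predicted_high_tg = non_hdl_from_ldl_direct(ldl_direct=100.0, triglycerides=450.0)
print(measured, predicted_low_tg, predicted_high_tg)
```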

Predicting standardized ileal digestibility of lysine in full-fat soybeans using chemical composition and physical characteristics

  • Chanwit Kaewtapee;Rainer Mosenthin
    • Animal Bioscience / v.37 no.6 / pp.1077-1084 / 2024
  • Objective: The present work was conducted to evaluate suitable variables and develop prediction equations using chemical composition and physical characteristics for estimating standardized ileal digestibility (SID) of lysine (Lys) in full-fat soybeans (FFSB). Methods: The chemical composition and physical characteristics were determined including trypsin inhibitor activity (TIA), urease activity (UA), protein solubility in 0.2% potassium hydroxide (KOH), protein dispersibility index (PDI), lysine to crude protein ratio (Lys:CP), reactive Lys:CP ratio, neutral detergent fiber, neutral detergent insoluble nitrogen (NDIN), acid detergent insoluble nitrogen (ADIN), acid detergent fiber, L* (lightness), and a* (redness). Pearson's correlation (r) was computed, and the relationship between variables was determined by linear or quadratic regression. Stepwise multiple regression was performed to develop prediction equations for SID of Lys. Results: Negative correlations (p<0.01) between SID of Lys and protein quality indicators were observed for TIA (r = -0.80), PDI (r = -0.80), and UA (r = -0.76). The SID of Lys also showed a quadratic response (p<0.01) to UA, NDIN, TIA, L*, KOH, a* and Lys:CP. The best-fit model for predicting SID of Lys in FFSB included TIA, UA, NDIN, and ADIN, resulting in the highest coefficient of determination ($R^2 = 0.94$). Conclusion: Quadratic regression with one variable indicated high accuracy for UA, NDIN, TIA, and PDI. The multiple linear regression including TIA, UA, NDIN, and ADIN is an alternative model that improves the accuracy of predicting SID of Lys in FFSB. Therefore, multiple indicators are warranted to assess either insufficient or excessive heat treatment accurately, which can be employed by the feed industry as measures for quality control purposes to predict SID of Lys in FFSB.

Determining Input Values for Dragging Anchor Assessments Using Regression Analysis (회귀분석을 이용한 주묘 위험성 평가 입력요소 결정에 관한 연구)

  • Kang, Byung-Sun;Jung, Chang-Hyun
    • Journal of the Korean Society of Marine Environment & Safety / v.27 no.6 / pp.822-831 / 2021
  • Although programs have been developed to evaluate the risk of dragging anchors, it is practically difficult for VTS (vessel traffic service) operators to calculate and evaluate these risks by obtaining input factors from anchored ships. Therefore, in this study, the gross tonnage (GT), which VTS operators can easily obtain from the ship, was set as the independent variable, and linear and nonlinear regression analyses were performed using the input factors as the dependent variables. Comparing the fit of the polynomial model (linear) and the power series model (nonlinear), the power series model was evaluated to be more suitable for all input factors in the case of container ships and bulk carriers. However, in the case of tanker ships, the power series model was suitable for the LBP (length between perpendiculars), width, and draft, and the polynomial model was evaluated to be more suitable for the front wind pressure area, weight of the anchor, equipment number, and height of the hawse pipe from the bottom of the ship. In addition, all dependent variables except the front wind pressure area of the tanker ship showed a high degree of fit, with a coefficient of determination (R-squared value) of 0.7 or more. Therefore, among the input factors of the dragging anchor risk assessment program, all factors except the external force, seabed quality, water depth, and amount of anchor chain let out can be supplied automatically by the regression model formulas when only the GT of the ship is provided.
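
The model comparison described in this abstract (a polynomial fit against a power-type fit, judged by R-squared) can be sketched as below. The data are simulated, the polynomial degree is an assumption, and the power model y = a * x**b is fitted by ordinary least squares on the log-log scale, which is a linearized stand-in for the nonlinear regression used in the paper.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination on the original scale."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_polynomial(x, y, degree=2):
    """Least-squares polynomial fit; returns fitted values."""
    coeffs = np.polyfit(x, y, degree)
    return np.polyval(coeffs, x)

def fit_power(x, y):
    """Fit y = a * x**b by OLS on log(y) versus log(x); returns fitted values."""
    b, log_a = np.polyfit(np.log(x), np.log(y), 1)
    return np.exp(log_a) * x ** b

# Hypothetical data: gross tonnage (in thousands) and one input factor.
rng = np.random.default_rng(1)
gt_thousand = np.linspace(5.0, 150.0, 40)
factor = 60.0 * gt_thousand ** 0.32 * np.exp(rng.normal(0.0, 0.03, gt_thousand.size))

r2_poly = r_squared(factor, fit_polynomial(gt_thousand, factor))
r2_power = r_squared(factor, fit_power(gt_thousand, factor))
print(r2_poly, r2_power)
```

Computing R-squared on the original scale for both models keeps the comparison like for like, even though the power model is estimated on the log scale.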

An empirical bracketed duration relation for stable continental regions of North America

  • Lee, Jongwon;Green, Russell A.
    • Earthquakes and Structures / v.3 no.1 / pp.1-15 / 2012
  • An empirical predictive relationship correlating bracketed duration to earthquake magnitude, site-to-source distance, and local site conditions (i.e., rock vs. stiff soil) for stable continental regions of North America is presented herein. The correlation was developed from 620 horizontal motions for central and eastern North America (CENA), consisting of 28 recorded motions and 592 scaled motions. The bracketed duration data comprised nonzero and zero durations. The non-linear mixed-effects (NLME) regression technique was used to fit a predictive model to the nonzero duration data. To account for the zero duration data, logistic regression was conducted to model the probability of zero duration occurrences. Then, the probability models were applied as weighting functions to the NLME regression results. Comparing the bracketed durations for CENA motions with those from active shallow crustal regions (e.g., western North America: WNA), the motions in CENA have longer bracketed durations than those in WNA. Especially for larger magnitudes at far distances, the bracketed durations in CENA tend to be significantly longer than those in WNA.