• Title/Summary/Keyword: generalized linear models

Automatic order selection procedure for count time series models (계수형 시계열 모형을 위한 자동화 차수 선택 알고리즘)

  • Ji, Yunmi;Seong, Byeongchan
    • The Korean Journal of Applied Statistics / v.33 no.2 / pp.147-160 / 2020
  • In this paper, we study an algorithm that automatically determines the orders of the past observations and past conditional means, both of which play an important role in count time series models. Starting from the orders of the ARIMA model, the algorithm constructs a group of candidate orders for time series generalized linear models and selects the final model by an information criterion among the combinations in that group. To evaluate the proposed algorithm, we perform small simulation studies and empirical analyses for several underlying models and time series, and compare forecasting performance with the ARIMA model. The comparison confirms that the time series generalized linear model outperforms the ARIMA model for count time series analysis. In addition, the empirical analysis shows better mid- and long-term forecasting performance than the ARIMA model.
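The selection step can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a log-linear Poisson autoregression with only past observations (no feedback from past conditional means), fits each candidate order by Newton's method, and picks the order with the smallest AIC. The function names and the toy series are made up for illustration.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_poisson_ar(y, p, start, n_iter=50):
    """Fit log(mu_t) = b0 + sum_k b_k * log(y[t-k] + 1), k = 1..p, by Newton's method.

    Only observations from index `start` (>= p) onward enter the likelihood,
    so that different orders are compared on the same sample.
    Returns (coefficients, Poisson log-likelihood).
    """
    rows = [[1.0] + [math.log(y[t - k] + 1.0) for k in range(1, p + 1)]
            for t in range(start, len(y))]
    obs = y[start:]
    k = p + 1
    # Start from the intercept-only fit: constant mean, zero lag coefficients.
    beta = [math.log(sum(obs) / len(obs))] + [0.0] * p

    def means(coef):
        # Clip the linear predictor to guard against overflow in exp().
        return [math.exp(max(-30.0, min(30.0, sum(c * x for c, x in zip(coef, row)))))
                for row in rows]

    for _ in range(n_iter):
        mu = means(beta)
        grad = [sum(row[j] * (yi - mi) for row, yi, mi in zip(rows, obs, mu))
                for j in range(k)]
        hess = [[sum(row[i] * row[j] * mi for row, mi in zip(rows, mu))
                 for j in range(k)] for i in range(k)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    mu = means(beta)
    loglik = sum(yi * math.log(mi) - mi - math.lgamma(yi + 1.0)
                 for yi, mi in zip(obs, mu))
    return beta, loglik

def select_order(y, max_p):
    """Return (best order, {order: AIC}) over the candidate orders 0..max_p."""
    aic = {}
    for p in range(max_p + 1):
        _, loglik = fit_poisson_ar(y, p, start=max_p)
        aic[p] = -2.0 * loglik + 2.0 * (p + 1)
    return min(aic, key=aic.get), aic

# Made-up count series, for illustration only.
toy_counts = [3, 5, 4, 6, 8, 7, 5, 4, 6, 9, 11, 8, 6, 5, 7, 10, 12, 9, 7, 6,
              4, 5, 8, 10, 9, 7, 5, 6, 8, 11, 10, 8, 6, 5, 7, 9, 12, 10, 8, 6]
best_p, aic_by_order = select_order(toy_counts, max_p=3)
print(best_p, aic_by_order)
```

Holding the estimation sample fixed across candidates (here by dropping the first `max_p` observations) keeps the AIC values comparable. A fuller version would also search over the order of the past conditional means.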

Comparison of Regression Models for Estimating Ventilation Rate of Mechanically Ventilated Swine Farm (강제환기식 돈사의 환기량 추정을 위한 회귀모델의 비교)

  • Jo, Gwanggon;Ha, Taehwan;Yoon, Sanghoo;Jang, Yuna;Jung, Minwoong
    • Journal of The Korean Society of Agricultural Engineers / v.62 no.1 / pp.61-70 / 2020
  • To estimate the ventilation volume of mechanically ventilated swine farms, various regression models were fitted and their errors compared in order to select the model that best reproduces the actual data. Linear regression, linear spline, polynomial regression (degrees 2 and 3), logistic curve, generalized additive model (GAM), and Gompertz curve were compared. Overfitted models were excluded even when their error rate was small. The evaluation criteria were root mean square error (RMSE) and mean absolute percentage error (MAPE). The degree-3 polynomial exhibited the lowest error rate; however, an inconsistent overestimation was observed in a certain section. The logistic curve was the most stable and was judged superior to the other models. Excluding the model with a large error rate and the overestimating model, the logistic curve gave the smallest estimated ventilation volume of all the models.
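The two evaluation criteria named in the abstract can be written out as a short sketch. The function names and the numbers below are illustrative assumptions, not data from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean square error, in the same units as the data."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Made-up measured and estimated ventilation values, for illustration only.
measured = [100.0, 200.0, 400.0]
estimated = [110.0, 180.0, 400.0]
print(rmse(measured, estimated), mape(measured, estimated))
```

RMSE penalizes large errors more heavily, while MAPE is scale-free, so the two criteria can rank candidate models differently.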

Second-Order REML for Random Effects Models

  • Ha, Il-Do;Cho, Geon-Ho
    • Journal of the Korean Data and Information Science Society / v.12 no.1 / pp.19-25 / 2001
  • Random effects models, which describe the dependence in various correlated data via random effects, have recently received considerable attention in the biomedical literature. They include mixed linear models (MLMs), generalized linear mixed models (GLMMs) and hierarchical generalized linear models (HGLMs). For inference, Lee and Nelder (2000) proposed first- and second-order REML (restricted maximum likelihood) methods based on the hierarchical likelihood of Lee and Nelder (1996). In this paper, for Poisson-gamma HGLMs, the new methods are compared theoretically with marginal likelihood methods, and both are illustrated by two practical examples.

Bayesian Modeling of Random Effects Covariance Matrix for Generalized Linear Mixed Models

  • Lee, Keunbaik
    • Communications for Statistical Applications and Methods / v.20 no.3 / pp.235-240 / 2013
  • Generalized linear mixed models (GLMMs) are frequently used to analyze longitudinal categorical data when subject-specific effects are of interest. In GLMMs, the structure of the random effects covariance matrix is important for estimating the fixed effects and for explaining subject and time variation. Estimating the matrix is not simple because of its high dimension and the positive definiteness constraint; consequently, in practice a simple covariance structure such as AR(1) is used. However, this strong assumption can result in biased estimates of the fixed effects. In this paper, we introduce Bayesian modeling approaches for the random effects covariance matrix using a modified Cholesky decomposition. The modified Cholesky decomposition approach can accommodate a heterogeneous random effects covariance matrix, and the resulting estimated covariance matrix is positive definite. We analyze metabolic syndrome data from a Korean Genomic Epidemiology Study using these methods.
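Why the modified Cholesky decomposition yields a positive definite matrix can be seen in a small sketch. It assumes the usual parameterization T Σ T' = D, where T is unit lower triangular with entries -phi[t][j] below the diagonal and D is a diagonal matrix of positive innovation variances. The function name and the parameter values are illustrative, and the Bayesian estimation itself is not shown.

```python
def covariance_from_modified_cholesky(phi, innovation_var):
    """Build Sigma = L D L' from unconstrained autoregressive parameters.

    phi[t][j] (for j < t) is the coefficient of effect j in the regression of
    effect t on its predecessors; innovation_var[t] > 0 is the residual
    variance of that regression. L is the inverse of the unit lower
    triangular matrix T, computed row by row from
        effect_t = sum_j phi[t][j] * effect_j + innovation_t.
    """
    q = len(innovation_var)
    L = [[0.0] * q for _ in range(q)]
    for t in range(q):
        L[t][t] = 1.0
        for j in range(t):
            for k in range(j + 1):
                L[t][k] += phi[t][j] * L[j][k]
    # Sigma[r][c] = sum_k L[r][k] * D[k] * L[c][k]
    return [[sum(L[r][k] * innovation_var[k] * L[c][k] for k in range(q))
             for c in range(q)] for r in range(q)]

# Illustrative parameter values (assumptions, not estimates from the paper).
phi = [[], [0.5], [0.2, -0.3]]
innovation_var = [1.0, 2.0, 0.5]
sigma = covariance_from_modified_cholesky(phi, innovation_var)
for row in sigma:
    print(row)
```

Any real values of `phi` combined with positive `innovation_var` give a symmetric positive definite matrix, because L is invertible and D has a positive diagonal. This is why the parameters can be modeled without explicit constraints.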

Modelling Heterogeneity in Fertility for Analysis of Variety Trials (밭의 비옥도를 고려한 품종실험 분석)

  • 윤성철;강위창;이영조;임용빈
    • The Korean Journal of Applied Statistics / v.11 no.2 / pp.423-433 / 1998
  • In agricultural field experiments, the completely randomized block design is often used for the analysis of variety trials. An important assumption is that every experimental unit in each block has the same fertility. In most agricultural field experiments, however, there is often systematic heterogeneity in fertility among the experimental units. To account for this heterogeneity, we propose using hierarchical generalized linear models. We compare our analysis of data from the Scottish Agricultural Colleges list with an analysis using the Markov chain Monte Carlo method.

A Graphical Method of Checking the Adequacy of Linear Systematic Component in Generalized Linear Models (일반화선형모형에서 선형성의 타당성을 진단하는 그래프)

  • Kim, Ji-Hyun
    • Communications for Statistical Applications and Methods / v.15 no.1 / pp.27-41 / 2008
  • A graphical method for checking the adequacy of a generalized linear model is proposed. The graph helps to assess the assumption that the link function of the mean can be expressed as a linear combination of the explanatory variables. To draw the graph, boosting is used to estimate nonparametrically the relationship between the link function of the mean and the explanatory variables, although any other nonparametric regression method could be used. The effectiveness of the graph is demonstrated through simulation studies with normal and binary data. We also list some limitations and technical details of the graph.

Empirical Comparisons of Disparity Measures for Three Dimensional Log-Linear Models

  • Park, Y.S.;Hong, C.S.;Jeong, D.B.
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.543-557 / 2006
  • This paper is concerned with the applicability of the chi-square approximation to six disparity statistics: the Pearson chi-square, the generalized likelihood ratio, the power divergence, the blended weight chi-square, the blended weight Hellinger distance, and the negative exponential disparity statistic. Three-dimensional contingency tables of small and moderate sample sizes are generated and fitted to all possible hierarchical log-linear models: the completely independent model, the conditionally independent model, the partial association models, and the model with one variable independent of the other two. For models with direct solutions for the expected cell counts, point estimates and confidence intervals of the 90 and 95 percentage points of the six statistics are explored. For models without direct solutions, the empirical significance levels and empirical powers of the six statistics for testing the significance of the three-factor interaction are computed and compared.
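Three of the six statistics have compact textbook forms, sketched below for a table whose cells are flattened into a list. This is only an illustration: the expected counts are assumed to come from an already fitted log-linear model, the blended weight and negative exponential statistics are omitted, and the counts are made up.

```python
import math

def pearson_chi2(observed, expected):
    """Pearson chi-square: sum of (O - E)^2 / E over the cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def likelihood_ratio(observed, expected):
    """Likelihood ratio statistic G^2 = 2 * sum of O * log(O / E).

    Cells with O = 0 contribute zero.
    """
    return 2.0 * sum(o * math.log(o / e)
                     for o, e in zip(observed, expected) if o > 0)

def power_divergence(observed, expected, lam):
    """Cressie-Read power divergence with index lam (not 0 or -1).

    lam = 1 reproduces the Pearson statistic when the observed and expected
    totals agree; lam -> 0 approaches G^2. Use positive counts for lam < 0.
    """
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in zip(observed, expected))
    return 2.0 * total / (lam * (lam + 1.0))

# Made-up cell counts with equal totals, for illustration only.
observed = [18.0, 22.0, 30.0, 30.0]
expected = [25.0, 25.0, 25.0, 25.0]
print(pearson_chi2(observed, expected))
print(likelihood_ratio(observed, expected))
print(power_divergence(observed, expected, 2.0 / 3.0))
```

All three are referred to the same chi-square distribution asymptotically. How well that approximation holds in small samples is the kind of question the abstract describes.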

Bootstrap Estimation for GEE Models (일반화추정방정식(GEE)에 대한 부스트랩의 적용)

  • Park, Chong-Sun;Jeon, Yong-Moon
    • The Korean Journal of Applied Statistics / v.24 no.1 / pp.207-216 / 2011
  • The bootstrap is a resampling technique for estimating parameters or for evaluating an estimate. It has been used to estimate parameters in the linear model (LM) and the generalized linear model (GLM). In this paper, we explore the possibility of applying the three bootstrap methods most widely used in LM and GLM (bootstrapping residuals, bootstrapping pairs, and bootstrapping an estimating equation) to the generalized estimating equation (GEE) algorithm for modelling repeatedly measured regression data sets. We compare the three bootstrap methods with the coefficient and standard error estimates of GEE models for one simulated and one real data set. Overall, the estimates obtained from the bootstrap methods are quite comparable, except that those from bootstrapping pairs differ somewhat from the others. We conjecture that this unusual behavior comes from the inconsistency of those estimates. However, a more thorough simulation study is needed before generalizing, since these results come from only two small data sets.
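Bootstrapping pairs for repeated measures can be sketched as follows. This is a simplified illustration, not the paper's procedure. Whole subjects are resampled with replacement so that within-subject dependence is preserved, and an ordinary least squares slope stands in for the GEE fit. The function names and the data are assumptions.

```python
import random
import statistics

def ols_slope(points):
    """Least squares slope of y on x for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx

def cluster_pairs_bootstrap_se(subjects, n_boot=500, seed=0):
    """Bootstrap standard error of the slope, resampling whole subjects.

    Each element of `subjects` is a list of (x, y) pairs measured on the
    same subject; all pairs of a drawn subject are kept together.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        drawn = [rng.choice(subjects) for _ in subjects]
        pooled = [pair for subject in drawn for pair in subject]
        estimates.append(ols_slope(pooled))
    return statistics.stdev(estimates)

# Made-up (time, response) measurements for six subjects, for illustration only.
toy_subjects = [
    [(0, 1.0), (1, 2.1), (2, 2.9)],
    [(0, 0.5), (1, 1.8), (2, 3.2)],
    [(0, 1.4), (1, 2.0), (2, 3.9)],
    [(0, 0.9), (1, 2.6), (2, 3.1)],
    [(0, 1.2), (1, 1.7), (2, 3.5)],
    [(0, 0.7), (1, 2.3), (2, 2.7)],
]
print(cluster_pairs_bootstrap_se(toy_subjects, n_boot=200, seed=1))
```

Swapping `ols_slope` for an actual GEE fit would give the pairs variant discussed in the abstract. The residual and estimating-equation variants resample different quantities and are not shown.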

Multicollinearity in Logistic Regression

  • Jong-Han Lee;Myung-Hoe Huh
    • Communications for Statistical Applications and Methods / v.2 no.2 / pp.303-309 / 1995
  • Many measures for detecting multicollinearity in linear regression have been proposed in the statistics and numerical analysis literature. Among them, the condition number and the variance inflation factor (VIF) are the most popular. In this study, we give new interpretations of the condition number and VIF in linear regression, using geometry on the explanatory space. Along the same lines, we derive natural analogues of the condition number and VIF for logistic regression. These computer-intensive measures can easily be extended to evaluate multicollinearity in generalized linear models.
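For orientation, the classical linear-regression versions of the two measures can be sketched for the simplest case of two predictors. This is not the paper's logistic-regression extension. It assumes standardized predictors, so the 2x2 correlation matrix has eigenvalues 1 + |r| and 1 - |r|. The function names and the data are made up.

```python
import math

def correlation(x1, x2):
    """Pearson correlation between two predictors."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    return s12 / math.sqrt(s11 * s22)

def vif_two_predictors(r):
    """VIF = 1 / (1 - R^2); with two predictors, R^2 is r squared."""
    return 1.0 / (1.0 - r ** 2)

def condition_number_two_predictors(r):
    """Condition number of the standardized design matrix.

    Square root of the ratio of the largest to the smallest eigenvalue
    of the 2x2 correlation matrix.
    """
    return math.sqrt((1.0 + abs(r)) / (1.0 - abs(r)))

# Made-up predictor values, for illustration only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
r = correlation(x1, x2)
print(r, vif_two_predictors(r), condition_number_two_predictors(r))
```

Both measures grow without bound as |r| approaches 1. With more than two predictors, the VIF of each predictor comes from regressing it on all the others, and the condition number from the full eigenvalue spectrum.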
