• Title/Abstract/Keywords: Penalized regression

Search results: 78 items (processing time: 0.018 seconds)

Negative Binomial Varying Coefficient Partially Linear Models

  • Kim, Young-Ju
    • Communications for Statistical Applications and Methods / Vol. 19, No. 6 / pp. 809-817 / 2012
  • We propose a semiparametric inference procedure for a generalized varying coefficient partially linear model (VCPLM) for negative binomial data. The VCPLM is useful for modeling real data because varying coefficients capture a special type of interaction between explanatory variables, and partially linear models accommodate both parametric and nonparametric terms. The negative binomial distribution often arises in modeling count data, which are usually overdispersed. The varying coefficient function estimators and regression parameters in the generalized VCPLM are obtained by formulating a penalized likelihood through smoothing splines for negative binomial data when the shape parameter is known. The performance of the proposed method is then evaluated by simulations.
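
As a hedged illustration of the objective described in this abstract, the penalized likelihood for negative binomial data with a known shape parameter can be written in a generic smoothing-spline form. The symbols $\mu_i$, $\nu$, $\eta$, $\lambda$, and $J$, as well as the log link, are notation assumed here for illustration and are not taken from the paper:

$$
f(y \mid \mu, \nu) = \frac{\Gamma(y+\nu)}{y!\,\Gamma(\nu)} \left(\frac{\nu}{\nu+\mu}\right)^{\nu} \left(\frac{\mu}{\nu+\mu}\right)^{y}, \qquad \operatorname{Var}(Y) = \mu + \frac{\mu^{2}}{\nu}
$$

$$
\min_{\eta}\; -\frac{1}{n}\sum_{i=1}^{n} \log f(y_i \mid \mu_i, \nu) + \frac{\lambda}{2}\, J(\eta), \qquad \log \mu_i = \eta_i
$$

Here $\nu$ is the known shape parameter, $\eta$ collects the parametric and varying coefficient terms, $J$ is a roughness penalty, and $\lambda$ is the smoothing parameter. Because the variance exceeds the mean $\mu$ for any finite $\nu$, this distribution accommodates overdispersed counts.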

A numerical study on group quantile regression models

  • Kim, Doyoen;Jung, Yoonsuh
    • Communications for Statistical Applications and Methods / Vol. 26, No. 4 / pp. 359-370 / 2019
  • Grouping structures in covariates are often ignored in regression models. Recent statistical developments that account for grouping structure show clear advantages; however, incorporating grouping structure into quantile regression models has been relatively rare in the literature. Grouping structure is usually handled by employing a group penalty. In this work, we apply the idea of a group penalty to quantile regression models. The grouping structure is assumed to be known, which holds in some common cases. For example, a set of dummy variables derived from one categorical variable can be regarded as one group of covariates. We examine group quantile regression models via two real data analyses and simulation studies, which reveal that group quantile regression models outperform their non-group counterparts when grouping structures exist among the variables.
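
The combination of a quantile loss with a group penalty described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear model, the standard check (pinball) loss, and a group-lasso penalty weighted by the square root of the group size. The function names and the `groups` label vector are hypothetical.

```python
import numpy as np

def check_loss(residuals, tau):
    """Sum of the quantile check loss rho_tau(r) = r * (tau - 1{r < 0})."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sum(r * (tau - (r < 0))))

def group_penalty(beta, groups):
    """Group-lasso penalty (assumed form): sum of sqrt(size) * Euclidean norm per group."""
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)
    total = 0.0
    for g in np.unique(groups):
        block = beta[groups == g]  # coefficients that belong to group g
        total += np.sqrt(block.size) * np.linalg.norm(block)
    return float(total)

def group_quantile_objective(beta, X, y, tau, lam, groups):
    """Check loss of the linear fit plus lam times the group penalty."""
    beta = np.asarray(beta, dtype=float)
    residuals = np.asarray(y, dtype=float) - np.asarray(X, dtype=float) @ beta
    return check_loss(residuals, tau) + lam * group_penalty(beta, groups)
```

For dummy variables derived from one categorical variable, all of their coefficients would share one label in `groups`, so the penalty can shrink the whole block toward zero together.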

Semiparametric Regression Splines in Matched Case-Control Studies

  • Kim, In-Young;Carroll, Raymond J.;Cohen, Noah
    • Korean Statistical Society: Conference Proceedings / Proceedings of the Korean Statistical Society 2003 Spring Conference / pp. 167-170 / 2003
  • We develop semiparametric methods for matched case-control studies using regression splines. Three methods are developed: an approximate cross-validation scheme to estimate the smoothing parameter inherent in regression splines, as well as Monte Carlo Expectation Maximization (MCEM) and Bayesian methods to fit the regression spline model. We compare the approximate cross-validation, MCEM, and Bayesian approaches using simulation, showing that they appear approximately equally efficient, with the approximate cross-validation method being computationally the most convenient. An example from equine epidemiology that motivated the work is used to demonstrate our approaches.

A convenient approach for penalty parameter selection in robust lasso regression

  • Kim, Jongyoung;Lee, Seokho
    • Communications for Statistical Applications and Methods / Vol. 24, No. 6 / pp. 651-662 / 2017
  • We propose an alternative procedure for selecting the penalty parameter in $L_1$ penalized robust regression. The procedure is based on marginalizing the prior distribution over the penalty parameter, so the resulting objective function does not include the penalty parameter. In addition, the estimating algorithm automatically chooses a penalty parameter using the previous estimate of the regression coefficients. The proposed approach bypasses cross-validation and saves computing time. Variable-wise penalization also performs best in terms of prediction and variable selection. Numerical studies using simulated data demonstrate the performance of our proposals, and the proposed methods are applied to the Boston housing data. Through the simulation study and the real data application, we demonstrate that our proposals are competitive with, or much better than, cross-validation in terms of prediction, variable selection, and computing time.

Variable selection in censored kernel regression

  • Choi, Kook-Lyeol;Shim, Jooyong
    • Journal of the Korean Data and Information Science Society / Vol. 24, No. 1 / pp. 201-209 / 2013
  • In censored regression, it is often the case that some input variables are unimportant, while others matter more than the rest. We propose a novel algorithm for selecting the important input variables in censored kernel regression. The algorithm is based on penalized regression with a weighted quadratic loss function for the censored data, where the weight is computed from the empirical survival function of the censoring variable. We employ a weighted version of ANOVA decomposition kernels to choose an optimal subset of important input variables. Experimental results are then presented, which indicate the performance of the proposed variable selection method.

Support vector quantile regression ensemble with bagging

  • Shim, Jooyong;Hwang, Changha
    • Journal of the Korean Data and Information Science Society / Vol. 25, No. 3 / pp. 677-684 / 2014
  • Support vector quantile regression (SVQR) is capable of providing a more complete description of the linear and nonlinear relationships among random variables. To improve the estimation performance of SVQR, we propose using an SVQR ensemble with bagging (bootstrap aggregating), in which SVQRs are trained independently on training data sets sampled randomly via a bootstrap method. They are then aggregated to obtain the estimator of the quantile regression function using a penalized objective function composed of check functions. Experimental results are then presented, which illustrate the performance of the SVQR ensemble with bagging.
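
The bootstrap-and-aggregate pattern described in this abstract can be sketched in a deliberately simplified form. The sketch below makes two substitutions that are assumptions, not the paper's method: the base learner is a constant model fitted by the sample quantile rather than an SVQR, and the aggregation is a plain average rather than a penalized objective composed of check functions. The function name and parameters are hypothetical.

```python
import numpy as np

def bagged_quantile(y, tau, n_boot=200, seed=0):
    """Average the tau-th sample quantile over bootstrap resamples of y."""
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        # Bootstrap sample: draw n observations with replacement.
        sample = rng.choice(y, size=y.size, replace=True)
        # Base learner (assumption): constant model given by the sample quantile.
        estimates[b] = np.quantile(sample, tau)
    # Aggregation (assumption): simple average of the bootstrap estimates.
    return float(estimates.mean())
```

Each base learner is fitted independently on its own resample, which is the property that bagging relies on to reduce the variance of the combined estimator.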

Bayesian Curve-Fitting in Semiparametric Small Area Models with Measurement Errors

  • Hwang, Jinseub;Kim, Dal Ho
    • Communications for Statistical Applications and Methods / Vol. 22, No. 4 / pp. 349-359 / 2015
  • We study a semiparametric Bayesian approach to small area estimation under a nested error linear regression model with an area-level covariate subject to measurement error. Within this setup, we consider radial basis functions for the regression spline, with knots placed on a grid of equally spaced sample quantiles of the covariate measured with error. We construct a hierarchical Bayesian structural measurement error model for small areas and, because some priors are noninformative improper priors, prove the propriety of the joint posterior under the given hierarchical Bayesian framework; the model is fitted using Markov chain Monte Carlo methods. Our methodology is illustrated using numerical examples that compare possible models based on model adequacy criteria; in addition, an analysis is conducted on real data.

A two-step approach for variable selection in linear regression with measurement error

  • Song, Jiyeon;Shin, Seung Jun
    • Communications for Statistical Applications and Methods / Vol. 26, No. 1 / pp. 47-55 / 2019
  • Identifying informative variables is important in high-dimensional data analysis; however, the task becomes challenging when covariates are contaminated by measurement error, because of the bias that the error induces. In this article, we present a two-step approach for variable selection in the presence of measurement error. In the first step, we select important variables directly from the contaminated covariates as if there were no measurement error. In the second step, we apply orthogonal regression to obtain unbiased estimates of the regression coefficients of the variables identified in the first step. In addition, we propose a modification of the two-step approach to further enhance the variable selection performance. Various simulation studies demonstrate the promising performance of the proposed method.

Kernel Poisson Regression for Longitudinal Data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society / Vol. 19, No. 4 / pp. 1353-1360 / 2008
  • An estimating procedure is introduced for nonlinear mixed-effect Poisson regression in longitudinal studies, where data from different subjects are independent whereas data from the same subject are correlated. The proposed procedure provides estimates of the mean function of the response variables, where the canonical parameter is related to the input vector in a nonlinear form. A generalized cross-validation function is introduced to choose optimal hyper-parameters in the procedure. Experimental results are then presented, which indicate the performance of the proposed estimating procedure.
