• 제목/요약/키워드: penalized spline regression

검색결과 11건 처리시간 0.019초

벌점 스플라인 회귀모형에서의 이상치 탐지방법 (An Outlier Detection Method in Penalized Spline Regression Models)

  • 서한손;송지은;윤민
    • 응용통계연구
    • /
    • 제26권4호
    • /
    • pp.687-696
    • /
    • 2013
  • 이상치가 존재하는 경우 모형 적합의 결과가 왜곡될 수 있기 때문에 이상치 탐색은 데이터분석에 있어서 매우 중요하다. 이상치 탐지 방법은 많은 학자들에 의해 연구되어 왔다. 본 논문에서는 Hadi와 Simonoff (1993)가 제안한 직접적 이상치 탐지 방법을 벌점 스플라인 회귀모형에 적용하여 이상치를 탐지하는 과정을 제안하며 모의실험과 실제 데이터에 적용을 통하여 스플라인 회귀모형, 강건 벌점 스플라인 회귀모형과 효율성을 비교한다.

A Penalized Spline Based Method for Detecting the DNA Copy Number Alteration in an Array-CGH Experiment

  • Kim, Byung-Soo;Kim, Sang-Cheol
    • 응용통계연구
    • /
    • 제22권1호
    • /
    • pp.115-127
    • /
    • 2009
  • The purpose of statistical analyses of array-CGH experiment data is to divide the whole genome into regions of equal copy number, to quantify the copy number in each region and finally to evaluate its significance of being different from two. Several statistical procedures have been proposed which include the circular binary segmentation, and a Gaussian based local regression for detecting break points (GLAD) by estimating a piecewise constant function. We propose in this note a penalized spline regression and its simultaneous confidence band(SCB) approach to evaluate the statistical significance of regions of genetic gain/loss. The region of which the simultaneous confidence band stays above 0 or below 0 can be considered as a region of genetic gain or loss. We compare the performance of the SCB procedure with GLAD and hidden Markov model approaches through a simulation study in which the data were generated from AR(1) and AR(2) models to reflect spatial dependence of the array-CGH data in addition to the independence model. We found that the SCB method is more sensitive in detecting the low level copy number alterations.

Semiparametric Bayesian Estimation under Structural Measurement Error Model

  • Hwang, Jin-Seub;Kim, Dal-Ho
    • Communications for Statistical Applications and Methods
    • /
    • 제17권4호
    • /
    • pp.551-560
    • /
    • 2010
  • This paper considers a Bayesian approach to modeling a flexible regression function under structural measurement error model. The regression function is modeled based on semiparametric regression with penalized splines. Model fitting and parameter estimation are carried out in a hierarchical Bayesian framework using Markov chain Monte Carlo methodology. Their performances are compared with those of the estimators under structural measurement error model without a semiparametric component.

Semiparametric Bayesian estimation under functional measurement error model

  • Hwang, Jin-Seub;Kim, Dal-Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • 제21권2호
    • /
    • pp.379-385
    • /
    • 2010
  • This paper considers Bayesian approach to modeling a flexible regression function under functional measurement error model. The regression function is modeled based on semiparametric regression with penalized splines. Model fitting and parameter estimation are carried out in a hierarchical Bayesian framework using Markov chain Monte Carlo methodology. Their performances are compared with those of the estimators under functional measurement error model without semiparametric component.

Bayesian Curve-Fitting in Semiparametric Small Area Models with Measurement Errors

  • Hwang, Jinseub;Kim, Dal Ho
    • Communications for Statistical Applications and Methods
    • /
    • 제22권4호
    • /
    • pp.349-359
    • /
    • 2015
  • We study a semiparametric Bayesian approach to small area estimation under a nested error linear regression model with area level covariate subject to measurement error. Consideration is given to radial basis functions for the regression spline and knots on a grid of equally spaced sample quantiles of covariate with measurement errors in the nested error linear regression model setup. We conduct a hierarchical Bayesian structural measurement error model for small areas and prove the propriety of the joint posterior based on a given hierarchical Bayesian framework since some priors are defined non-informative improper priors that uses Markov Chain Monte Carlo methods to fit it. Our methodology is illustrated using numerical examples to compare possible models based on model adequacy criteria; in addition, analysis is conducted based on real data.

Negative Binomial Varying Coefficient Partially Linear Models

  • Kim, Young-Ju
    • Communications for Statistical Applications and Methods
    • /
    • 제19권6호
    • /
    • pp.809-817
    • /
    • 2012
  • We propose a semiparametric inference for a generalized varying coefficient partially linear model(VCPLM) for negative binomial data. The VCPLM is useful to model real data in that varying coefficients are a special type of interaction between explanatory variables and partially linear models fit both parametric and nonparametric terms. The negative binomial distribution often arise in modelling count data which usually are overdispersed. The varying coefficient function estimators and regression parameters in generalized VCPLM are obtained by formulating a penalized likelihood through smoothing splines for negative binomial data when the shape parameter is known. The performance of the proposed method is then evaluated by simulations.

Semiparametric Regression Splines in Matched Case-Control Studies

  • Kim, In-Young;Carroll, Raymond J.;Cohen, Noah
    • 한국통계학회:학술대회논문집
    • /
    • 한국통계학회 2003년도 춘계 학술발표회 논문집
    • /
    • pp.167-170
    • /
    • 2003
  • We develop semiparametric methods for matched case-control studies using regression splines. Three methods are developed: an approximate crossvalidation scheme to estimate the smoothing parameter inherent in regression splines, as well as Monte Carlo Expectation Maximization (MCEM) and Bayesian methods to fit the regression spline model. We compare the approximate cross-validation approach, MCEM and Bayesian approaches using simulation, showing that they appear approximately equally efficient, with the approximate cross-validation method being computationally the most convenient. An example from equine epidemiology that motivated the work is used to demonstrate our approaches.

  • PDF

Bayesian curve-fitting with radial basis functions under functional measurement error model

  • Hwang, Jinseub;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • 제26권3호
    • /
    • pp.749-754
    • /
    • 2015
  • This article presents Bayesian approach to regression splines with knots on a grid of equally spaced sample quantiles of the independent variables under functional measurement error model.We consider small area model by using penalized splines of non-linear pattern. Specifically, in a basis functions of the regression spline, we use radial basis functions. To fit the model and estimate parameters we suggest a hierarchical Bayesian framework using Markov Chain Monte Carlo methodology. Furthermore, we illustrate the method in an application data. We check the convergence by a potential scale reduction factor and we use the posterior predictive p-value and the mean logarithmic conditional predictive ordinate to compar models.

Efficient estimation and variable selection for partially linear single-index-coefficient regression models

  • Kim, Young-Ju
    • Communications for Statistical Applications and Methods
    • /
    • 제26권1호
    • /
    • pp.69-78
    • /
    • 2019
  • A structured model with both single-index and varying coefficients is a powerful tool in modeling high dimensional data. It has been widely used because the single-index can overcome the curse of dimensionality and varying coefficients can allow nonlinear interaction effects in the model. For high dimensional index vectors, variable selection becomes an important question in the model building process. In this paper, we propose an efficient estimation and a variable selection method based on a smoothing spline approach in a partially linear single-index-coefficient regression model. We also propose an efficient algorithm for simultaneously estimating the coefficient functions in a data-adaptive lower-dimensional approximation space and selecting significant variables in the index with the adaptive LASSO penalty. The empirical performance of the proposed method is illustrated with simulated and real data examples.

경시적 자료의 주의력 결핍 과잉행동 장애를 종점으로 한 납의 벤치마크 용량 하한 도출 (Derivation of a benchmark dose lower bound of lead for attention deficit hyperactivity disorder using a longitudinal data set)

  • 이주형;김시연;하미나;권호장;김병수
    • 응용통계연구
    • /
    • 제29권7호
    • /
    • pp.1295-1309
    • /
    • 2016
  • 본 연구의 목적은 아동 건강에 미치는 환경의 영향을 평가하기 위하여 우리나라 환경부에서 구축한 경시적 자료인 CHEER 자료를 바탕으로 납의 벤치마크 용량 하한(BMDL)을 도출하여 Kim 등 (2014)의 결과를 재현하는 것이다. 본 연구에서는 CHEER 자료의 2005년 동집단을 사용하였는데, 벌점화 선형 스플라인을 이용한 변환공식으로 2005년 동집단의 ADHD 평가 척도를 통일하고, 경시적 자료의 특성을 반영한 두 개의 선형혼합모형을 구축하였다. 이후 구축된 모형을 바탕으로 혈중 납 농도의 BMDL을 도출하였다. 이 과정에서 Kim 등 (2014)에서 발견한 ADHD 점수의 평균으로의 회귀 현상이 재확인되었고, 2005년 동집단과 2006년 동집단의 분포 상의 특징적 차이가 발견되었다. 결과적으로 이 차이를 감안했을 때, Kim 등 (2014)과 일치적인 결과를 얻을 수 있었다.