• Title/Summary/Keyword: nonparametric model selection


Selection of Data-adaptive Polynomial Order in Local Polynomial Nonparametric Regression

  • Jo, Jae-Keun
    • Communications for Statistical Applications and Methods / v.4 no.1 / pp.177-183 / 1997
  • A data-adaptive order selection procedure is proposed for local polynomial nonparametric regression. For each candidate polynomial order, bias and variance are estimated, and the order with the smallest estimated mean squared error is selected locally at each location point. To estimate the mean squared error, the empirical bias estimate of Ruppert (1995) and the local polynomial variance estimate of Ruppert, Wand, Holst and Hössjer (1995) are used. Since the proposed method does not require fitting a polynomial model of order higher than the model order, it is simpler than the order selection method proposed by Fan and Gijbels (1995b).
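A minimal sketch of the idea of choosing the polynomial order locally is given below. It is not the procedure from the paper: instead of the empirical bias and variance estimates cited in the abstract, it scores each order by a kernel-weighted leave-one-out squared error around the target point. The Gaussian kernel, the bandwidth `h`, and the function names `local_poly_fit` and `select_order` are all assumptions made for illustration.

```python
import math

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def local_poly_fit(xs, ys, x0, order, h):
    """Local polynomial estimate of the regression function at x0 (Gaussian kernel)."""
    w = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    k = order + 1
    # Weighted normal equations in powers of (x - x0).
    a = [[sum(wi * (x - x0) ** (r + c) for wi, x in zip(w, xs)) for c in range(k)]
         for r in range(k)]
    b = [sum(wi * (x - x0) ** r * y for wi, x, y in zip(w, xs, ys)) for r in range(k)]
    return solve(a, b)[0]  # the intercept is the fitted value at x0

def select_order(xs, ys, x0, h, max_order=2):
    """Pick the order with the smallest kernel-weighted leave-one-out error near x0.

    This criterion is a stand-in for an estimated local mean squared error.
    """
    best_order, best_err = 0, float("inf")
    for p in range(max_order + 1):
        err = 0.0
        for i in range(len(xs)):
            w = math.exp(-0.5 * ((xs[i] - x0) / h) ** 2)
            pred = local_poly_fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], xs[i], p, h)
            err += w * (ys[i] - pred) ** 2
        if err < best_err:
            best_order, best_err = p, err
    return best_order
```

Calling `select_order` at each location point of interest gives a location-dependent order, which is the behaviour the abstract describes; only the scoring rule differs.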


Simultaneous outlier detection and variable selection via difference-based regression model and stochastic search variable selection

  • Park, Jong Suk; Park, Chun Gun; Lee, Kyeong Eun
    • Communications for Statistical Applications and Methods / v.26 no.2 / pp.149-161 / 2019
  • In this article, we propose an approach to simultaneous variable selection and outlier detection. First, we identify candidate outliers using properties of an intercept estimator in a difference-based regression model, and incorporate the outlier information into the multiple regression model by adding mean shift parameters. Second, we select the best model from the model that includes the outlier candidates as predictors, using stochastic search variable selection. Finally, we evaluate our method using simulations and real data analysis, with promising results. As future work, the method needs to be developed further to produce robust estimates, and we will also extend it to the nonparametric regression model for simultaneous outlier detection and variable selection.

Estimating dose-response curves using splines: a nonparametric Bayesian knot selection method

  • Lee, Jiwon; Kim, Yongku; Kim, Young Min
    • Communications for Statistical Applications and Methods / v.29 no.3 / pp.287-299 / 2022
  • In radiation epidemiology, the excess relative risk (ERR) model is used to determine the dose-response relationship. In general, the dose-response relationship for the ERR model is assumed to be linear, linear-quadratic, linear-threshold, quadratic, and so on. However, since none of these functions dominates the others in expressing the dose-response relationship, a Bayesian semiparametric method using splines has recently been proposed. We improve this Bayesian semiparametric method by selecting the spline tuning parameters, namely the number and location of knots, with a Bayesian knot selection method. Equally spaced knots cannot capture the characteristics of the radiation-exposed dose distribution, which is highly skewed in general. Therefore, we propose a nonparametric Bayesian knot selection method based on a Dirichlet process mixture model. Inference on the spline coefficients, after the number and location of knots are obtained, is performed in the Bayesian framework. We apply this approach to the Life Span Study cohort data from the Radiation Effects Research Foundation in Japan, and the results illustrate that the proposed method provides competitive curve estimates for the dose-response curve and relatively stable credible intervals for the curve.

Empirical variogram for achieving the best valid variogram

  • Mahdi, Esam; Abuzaid, Ali H.; Atta, Abdu M.A.
    • Communications for Statistical Applications and Methods / v.27 no.5 / pp.547-568 / 2020
  • Modeling the statistical autocorrelations in spatial data is often achieved through the estimation of variograms, where the selection of an appropriate valid variogram model, especially for small samples, is crucial for achieving precise spatial prediction results from kriging interpolations. To estimate such a variogram, we traditionally start by computing the empirical variogram (traditional Matheron, robust Cressie-Hawkins, or kernel-based nonparametric approaches). In this article, we conduct numerical studies comparing the performance of these empirical variograms. In most situations, the nonparametric empirical variable nearest-neighbor (VNN) variogram showed better performance than its competitors (Matheron, Cressie-Hawkins, and Nadaraya-Watson). The analysis of the spatial groundwater dataset used in this article suggests that the wave variogram model, with hole effect structure, fitted to the empirical VNN variogram is the most appropriate choice. This selected variogram is used with the ordinary kriging model to produce the predicted pollution map of the nitrate concentrations in the groundwater dataset.
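For orientation, the traditional Matheron estimator mentioned in this abstract averages half the squared differences of all observation pairs whose separation falls in a given lag bin. The sketch below shows only that baseline estimator, not the VNN or Cressie-Hawkins variants; the function name `matheron_variogram` and the fixed-width binning by `lag_width` are assumptions made for illustration.

```python
import math
from collections import defaultdict

def matheron_variogram(coords, values, lag_width):
    """Return {lag bin index k: semivariance} using Matheron's estimator.

    Bin k collects all pairs whose distance lies in [k * lag_width, (k + 1) * lag_width).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            k = int(d // lag_width)
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return {k: sums[k] / (2 * counts[k]) for k in sorted(sums)}
```

A parametric model such as the wave variogram would then be fitted to these binned values; that fitting step is outside this sketch.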

Parametric nonparametric methods for estimating extreme value distribution (극단값 분포 추정을 위한 모수적 비모수적 방법)

  • Woo, Seunghyun; Kang, Kee-Hoon
    • The Journal of the Convergence on Culture Technology / v.8 no.1 / pp.531-536 / 2022
  • This paper compares the performance of parametric and nonparametric methods for estimating the tail of a heavy-tailed distribution. For the parametric method, the generalized extreme value distribution and the generalized Pareto distribution were used; for the nonparametric method, kernel density estimation was applied. To compare the two approaches, function estimates from the block maxima model and the threshold excess model are presented together, using public daily fine dust data for each observatory in Seoul from 2014 to 2018. In addition, the areas where high concentrations of fine dust will occur were predicted through the return level.
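The return level used for prediction in this abstract has a closed form once generalized extreme value (GEV) parameters are available. The sketch below evaluates the standard formula for the level exceeded on average once every `period` blocks; the parameter values in the usage note are made up for illustration and are not estimates from the fine dust data, and the function names are assumptions.

```python
import math

def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function with location mu, scale sigma, shape xi."""
    if abs(xi) < 1e-12:  # Gumbel limit
        return math.exp(-math.exp(-(z - mu) / sigma))
    t = 1.0 + xi * (z - mu) / sigma
    if t <= 0:
        # z lies outside the support: below it when xi > 0, above it when xi < 0
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_return_level(period, mu, sigma, xi):
    """Level z_p with exceedance probability p = 1 / period per block."""
    p = 1.0 / period
    y = -math.log(1.0 - p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))
```

For example, `gev_return_level(100, 0.0, 1.0, 0.0)` gives the 100-block return level of a standard Gumbel distribution, and `gev_cdf` evaluated at that level returns 0.99 up to rounding.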

Feature selection in the semivarying coefficient LS-SVR

  • Hwang, Changha; Shim, Jooyong
    • Journal of the Korean Data and Information Science Society / v.28 no.2 / pp.461-471 / 2017
  • In this paper, we propose a feature selection method that identifies important features in the semivarying coefficient model. One important issue in the semivarying coefficient model is how to estimate the parametric and nonparametric components. Another issue is how to identify important features in the varying and the constant effects. We propose a feature selection method that addresses this issue using generalized cross validation functions of the varying coefficient least squares support vector regression (LS-SVR) and the linear LS-SVR. Numerical studies indicate that the proposed method is quite effective in identifying important features in the varying and the constant effects in the semivarying coefficient model.

Semiparametric Seasonal Cointegrating Rank Selection

  • Seong, Byeong-Chan; Ahn, Sung-K.; Cho, Sin-Sup
    • The Korean Journal of Applied Statistics / v.24 no.5 / pp.791-797 / 2011
  • This paper considers the issue of seasonal cointegrating rank selection by information criteria as an extension of Cheng and Phillips (2009). The method does not require the specification of lag length in vector autoregression, is convenient in empirical work, and is semiparametric because it allows for a general short memory error component in the model with only lags related to error correction terms. Some limit properties of the usual information criteria are given for the rank selection, and small Monte Carlo simulations are conducted to evaluate the performance of the criteria.

Portfolio Selection for Socially Responsible Investment via Nonparametric Frontier Models

  • Jeong, Seok-Oh; Hoss, Andrew; Park, Cheolwoo; Kang, Kee-Hoon; Ryu, Youngjae
    • Communications for Statistical Applications and Methods / v.20 no.2 / pp.115-127 / 2013
  • This paper provides an effective stock portfolio screening tool for socially responsible investment (SRI) based upon corporate social responsibility (CSR) and financial performance. The proposed approach utilizes nonparametric frontier models. Data envelopment analysis (DEA) has been used to build SRI portfolios in a few previous works; however, we show that free disposal hull (FDH), a similar model that does not assume the convexity of the technology, yields superior results when applied to a stock universe of 253 Korean companies. Over a four-year time span (from 2006 to 2009) the portfolios selected by the proposed method consistently outperform those selected by DEA as well as the benchmark.
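To make the free disposal hull (FDH) idea concrete, the sketch below computes a textbook input-oriented FDH efficiency score. A unit is compared only with observed units that produce at least as much of every output, with no convex combinations as in DEA. Which variables the paper treats as inputs and outputs is not stated in the abstract, so the data layout and the function name `fdh_input_efficiency` are assumptions; inputs are assumed strictly positive.

```python
def fdh_input_efficiency(inputs, outputs, o):
    """Input-oriented FDH efficiency of unit o (1.0 means FDH-efficient).

    inputs[j] and outputs[j] are the input and output vectors of unit j.
    """
    best = float("inf")
    for x_j, y_j in zip(inputs, outputs):
        # Unit j is a valid reference only if it weakly dominates unit o in every output.
        if all(yj >= yo for yj, yo in zip(y_j, outputs[o])):
            # Smallest proportional contraction of o's inputs that unit j can match.
            ratio = max(xj / xo for xj, xo in zip(x_j, inputs[o]))
            best = min(best, ratio)
    return best
```

Because unit `o` is always a valid reference for itself, the score never exceeds 1.0. A screening rule could, for instance, keep the units whose score equals 1.0, although the paper's actual portfolio construction may differ.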

Variable selection in partial linear regression using the least angle regression (부분선형모형에서 LARS를 이용한 변수선택)

  • Seo, Han Son; Yoon, Min; Lee, Hakbae
    • The Korean Journal of Applied Statistics / v.34 no.6 / pp.937-944 / 2021
  • The problem of selecting variables is addressed in partial linear regression. Model selection for partial linear models is not easy, since it involves both nonparametric estimation, such as smoothing parameter selection, and estimation for the linear explanatory variables. In this work, several approaches for variable selection are proposed using a fast forward selection algorithm, least angle regression (LARS). The proposed procedures apply t-tests, all-possible-regressions comparisons, or a stepwise selection process to the variables selected by LARS. An example based on real data and a simulation study on the performance of the suggested procedures are presented.

Penalized maximum likelihood estimation with symmetric log-concave errors and LASSO penalty

  • Park, Seo-Young; Kim, Sunyul; Seo, Byungtae
    • Communications for Statistical Applications and Methods / v.29 no.6 / pp.641-653 / 2022
  • Penalized least squares methods are important tools for simultaneously selecting variables and estimating parameters in linear regression. Penalized maximum likelihood can also be used for the same purpose, assuming that the error distribution falls in a certain parametric family of distributions. However, the use of a particular parametric family can suffer from a misspecification problem, which undermines the estimation accuracy. To give sufficient flexibility to the error distribution, we propose to use the symmetric log-concave error distribution with the LASSO penalty. A feasible algorithm to estimate both the nonparametric and parametric components in the proposed model is provided. Some numerical studies are also presented, showing that the proposed method produces more efficient estimators than some existing methods with similar variable selection performance.
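As a point of reference for the penalized least squares baseline named in the first sentence of this abstract, the sketch below runs plain coordinate descent for the LASSO with squared-error loss. It does not implement the paper's symmetric log-concave error model. The objective scaling (1/(2n)) times the residual sum of squares plus `lam` times the L1 norm, and the function names, are assumptions made for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t, returning exactly 0.0 inside [-t, t]."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X beta, kept up to date after every update
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the partial residual that excludes beta[j].
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta
```

Coefficients whose partial correlation stays below `lam` in absolute value are set exactly to zero, which is how the penalty performs variable selection and estimation at the same time.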