Title/Summary/Keyword: Sample Vector

Search Results: 270

A Note on Deconvolution Estimators when Measurement Errors are Normal

  • Lee, Sung-Ho
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.4
    • /
    • pp.517-526
    • /
    • 2012
  • In this paper, a support vector method is proposed for use when the sample observations are contaminated by a normally distributed measurement error. The performance of deconvolution density estimators based on the support vector method is explored and compared with kernel density estimators by means of a simulation study. An interesting result was that for the estimation of a kurtotic density, the support vector deconvolution estimator with a Gaussian kernel showed better performance than the classical deconvolution kernel estimator.
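A minimal numerical sketch of the classical deconvolution kernel estimator the abstract compares against (not the authors' support vector method). It relies on a standard identity: with a Gaussian kernel and a known Gaussian error standard deviation sigma, the Fourier-inversion deconvolution formula collapses to an ordinary Gaussian KDE with effective bandwidth sqrt(h^2 - sigma^2), valid for h > sigma. All data and parameter values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# True density: standard normal; observations contaminated by N(0, sigma^2) error.
n, sigma = 500, 0.3
x_true = rng.standard_normal(n)
y_obs = x_true + sigma * rng.normal(size=n)

def deconv_kde(y, grid, h, sigma):
    """Deconvolution kernel density estimate with a Gaussian kernel.

    For Gaussian measurement error and h > sigma, the Fourier-inversion
    formula collapses to a Gaussian KDE with bandwidth sqrt(h^2 - sigma^2).
    """
    assert h > sigma, "bandwidth must exceed the error standard deviation"
    h_eff = np.sqrt(h**2 - sigma**2)
    z = (grid[:, None] - y[None, :]) / h_eff
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(y) * h_eff * np.sqrt(2 * np.pi))

grid = np.linspace(-4, 4, 401)
f_hat = deconv_kde(y_obs, grid, h=0.5, sigma=sigma)
```

The estimate integrates to approximately one and is unimodal around zero, as the true density is.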

Iterative Support Vector Quantile Regression for Censored Data

  • Shim, Joo-Yong;Hong, Dug-Hun;Kim, Dal-Ho;Hwang, Chang-Ha
    • Communications for Statistical Applications and Methods
    • /
    • v.14 no.1
    • /
    • pp.195-203
    • /
    • 2007
  • In this paper we propose support vector quantile regression (SVQR) for randomly right censored data. The proposed procedure utilizes an iterative method based on the empirical distribution functions of the censored times and the sample quantiles of the observed variables, and applies support vector regression for the estimation of the quantile function. Experimental results are then presented to indicate the performance of the proposed procedure.
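The quantile estimation inside SVQR rests on the check (pinball) loss. A minimal sketch (plain uncensored data, not the authors' iterative censored-data procedure) showing that minimizing the average pinball loss recovers the sample tau-quantile:

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.exponential(scale=2.0, size=2000)
tau = 0.7

def pinball(y, q, tau):
    """Check (pinball) loss used by quantile regression."""
    r = y - q
    return float(np.mean(np.where(r >= 0, tau * r, (tau - 1) * r)))

# Grid search: the minimizer approximates the sample tau-quantile.
qs = np.linspace(0, 10, 1001)
losses = np.array([pinball(y, q, tau) for q in qs])
q_hat = float(qs[np.argmin(losses)])
```

SVQR replaces the grid search by a support vector regression fit under this same loss, so the constant-function case above is the simplest sanity check.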

An Analysis for the Structural Variation in the Unemployment Rate and the Test for the Turning Point (실업률 변동구조의 분석과 전환점 진단)

  • Kim, Tae-Ho;Hwang, Sung-Hye;Lee, Young-Hoon
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.2
    • /
    • pp.253-269
    • /
    • 2005
  • One of the basic assumptions of regression models is that the parameter vector does not vary across sample observations. If the parameter vector is not constant for all observations in the sample, the statistical model changes and the usual least squares estimators do not yield unbiased, consistent and efficient estimates. This study investigates the regression model in which some or all parameters vary across partitions of the whole sample data, permitting different response coefficients during unusual time periods. Since the usual test for overall homogeneity of regressions across partitions of the sample data does not explicitly identify the break points between the partitions, the test of equality between subsets of coefficients in two or more linear regressions is generalized and combined with a test procedure to search for the break point. The method is applied to find the possibility and the turning point of structural change in the long-run unemployment rate in the usual static framework by using the regression model. The relationships between the variables included in the model are reexamined in the dynamic framework by using Vector Autoregression.
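The homogeneity test described above is, in its simplest two-partition form, the classical Chow-type F test: fit the pooled regression and the two sub-sample regressions, then compare residual sums of squares. A sketch with simulated data containing a known break (hypothetical data, not the unemployment series):

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated series with a structural break at t = 60: intercept and slope change.
n, brk = 120, 60
x = np.linspace(0, 10, n)
beta1, beta2 = (1.0, 0.5), (3.0, -0.4)   # (intercept, slope) per regime
y = np.where(np.arange(n) < brk,
             beta1[0] + beta1[1] * x,
             beta2[0] + beta2[1] * x) + 0.3 * rng.normal(size=n)

def rss(X, y):
    """Residual sum of squares from an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

X = np.column_stack([np.ones(n), x])
k = X.shape[1]
rss_pooled = rss(X, y)
rss_split = rss(X[:brk], y[:brk]) + rss(X[brk:], y[brk:])

# Chow F statistic: large values reject constancy of the coefficients.
F = ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))
```

Searching the break point, as in the paper, amounts to computing this statistic over candidate break dates and locating its maximum.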

Sample-Adaptive Product Quantization and Design Algorithm (표본 적응 프러덕트 양자화와 설계 알고리즘)

  • 김동식;박섭형
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.12B
    • /
    • pp.2391-2400
    • /
    • 1999
  • Vector quantizer (VQ) is an efficient data compression technique for low bit rate applications. However, the major disadvantage of VQ is its encoding complexity, which increases dramatically as the vector dimension and bit rate increase. Even though one can use a modified VQ to reduce the encoding complexity, it is nearly impossible to implement such a VQ at a high bit rate or for a large vector dimension because of the enormously large memory requirement for the codebook and the very large training sequence (TS) size. To overcome this difficulty, in this paper we propose a novel structurally constrained VQ for the high bit rate and the large vector dimension cases in order to obtain VQ-level performance. Furthermore, this VQ can be extended to low bit rate applications. The proposed quantization scheme has the form of a feed-forward adaptive quantizer with a short adaptation period. Hence, we call this quantization scheme sample-adaptive product quantizer (SAPQ). SAPQ can provide a 2~3 dB improvement over Lloyd-Max scalar quantizers.

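A minimal product-quantizer sketch illustrating the structural constraint behind SAPQ (without the sample-adaptive selection step): each vector is split into subvectors, each quantized with its own small codebook, so memory and search cost grow with the sum rather than the product of the per-subvector codebook sizes. Dimensions, codebook sizes, and the Gaussian source are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def train_codebook(X, k, iters=20):
    """Plain Lloyd (k-means) codebook training for one subvector stream."""
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        idx = d.argmin(1)
        for j in range(k):
            if np.any(idx == j):           # keep old codeword if a cell empties
                C[j] = X[idx == j].mean(0)
    return C

def pq_encode(X, books):
    """Encode: nearest codeword index independently per subvector."""
    parts = np.split(X, len(books), axis=1)
    return np.stack([((p[:, None, :] - B[None]) ** 2).sum(-1).argmin(1)
                     for p, B in zip(parts, books)], axis=1)

def pq_decode(codes, books):
    """Decode: concatenate the selected codewords."""
    return np.hstack([B[codes[:, i]] for i, B in enumerate(books)])

# 8-dim vectors, 2 subvectors of dim 4, 16 codewords each (8 bits per vector).
X = rng.normal(size=(1000, 8))
books = [train_codebook(p, 16) for p in np.split(X, 2, axis=1)]
X_hat = pq_decode(pq_encode(X, books), books)
mse = float(((X - X_hat) ** 2).mean())
```

A single unstructured VQ at the same rate would need a 256-codeword, 8-dimensional codebook; the product structure needs only two 16-codeword, 4-dimensional ones.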

LS-SVM for large data sets

  • Park, Hongrak;Hwang, Hyungtae;Kim, Byungju
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.2
    • /
    • pp.549-557
    • /
    • 2016
  • In this paper we propose a multiclassification method for large data sets by ensembling least squares support vector machines (LS-SVM) with principal components instead of the raw input vector. We use the revised one-vs-all method for multiclassification, a voting scheme based on combining several binary classifications. The revised one-vs-all method is performed by using the hat matrix of the LS-SVM ensemble, which is obtained by ensembling LS-SVMs trained on random samples drawn from the whole large training data. The leave-one-out cross validation (CV) function is used to find the optimal values of the hyper-parameters which affect the performance of the multiclass LS-SVM ensemble. We present the generalized cross validation function to reduce the computational burden of the leave-one-out CV function. Experimental results from real data sets are then obtained to illustrate the performance of the proposed multiclass LS-SVM ensemble.

Lindley Type Estimation with Constraints on the Norm

  • Baek, Hoh-Yoo;Han, Kyou-Hwan
    • Honam Mathematical Journal
    • /
    • v.25 no.1
    • /
    • pp.95-115
    • /
    • 2003
  • Consider the problem of estimating a $p{\times}1$ mean vector ${\theta}(p{\geq}4)$ under the quadratic loss, based on a sample $X_1,\;{\cdots},\;X_n$. We find an optimal decision rule within the class of Lindley type decision rules which shrink the usual one toward the mean of observations when the underlying distribution is that of a variance mixture of normals and when the norm $||{\theta}-{\bar{\theta}}1||$ is known, where ${\bar{\theta}}=(1/p)\sum_{i=1}^p{\theta}_i$ and 1 is the column vector of ones. When the norm is restricted to a known interval, typically no optimal Lindley type rule exists, but we characterize a minimal complete class within the class of Lindley type decision rules. We also characterize the subclass of Lindley type decision rules that dominate the sample mean.


Lindley Type Estimators When the Norm is Restricted to an Interval

  • Baek, Hoh-Yoo;Lee, Jeong-Mi
    • Journal of the Korean Data and Information Science Society
    • /
    • v.16 no.4
    • /
    • pp.1027-1039
    • /
    • 2005
  • Consider the problem of estimating a $p{\times}1$ mean vector ${\theta}(p{\geq}4)$ under the quadratic loss, based on a sample $X_1$, $X_2$, $\cdots$, $X_n$. We find a Lindley type decision rule which shrinks the usual one toward the mean of observations when the underlying distribution is that of a variance mixture of normals and when the norm $||{\theta}-\bar{\theta}1||$ is restricted to a known interval, where $\bar{\theta}=\frac{1}{p}\sum_{i=1}^{p}{\theta}_i$ and 1 is the column vector of ones. In this case, we characterize a minimal complete class within the class of Lindley type decision rules. We also characterize the subclass of Lindley type decision rules that dominate the sample mean.

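Both abstracts above concern Lindley-type rules that shrink toward the grand mean. A minimal sketch of the unrestricted-norm rule for the normal case with identity covariance (a special case of the variance-mixture setting), with a small Monte Carlo comparing its risk to that of the usual estimator:

```python
import numpy as np

rng = np.random.default_rng(5)

def lindley(x):
    """Lindley-type estimator: shrink x toward its grand mean (needs p >= 4)."""
    p = len(x)
    xbar = x.mean()
    dev = x - xbar
    s = float(dev @ dev)                  # ||x - xbar*1||^2
    return xbar + (1.0 - (p - 3) / s) * dev

# Monte Carlo risk comparison under quadratic loss, theta constant (favorable case).
p, reps = 10, 3000
theta = np.full(p, 2.0)
se_usual = se_lindley = 0.0
for _ in range(reps):
    x = theta + rng.standard_normal(p)
    se_usual += float(((x - theta) ** 2).sum())
    se_lindley += float(((lindley(x) - theta) ** 2).sum())
risk_usual, risk_lindley = se_usual / reps, se_lindley / reps
```

When the components of theta are equal, the theoretical risks are $p$ for the usual estimator and $p - (p-3)^2\,E[1/\chi^2_{p-1}]$ for the Lindley rule, so the simulated gap should be large here.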

Enhancing Gene Expression Classification of Support Vector Machines with Generative Adversarial Networks

  • Huynh, Phuoc-Hai;Nguyen, Van Hoa;Do, Thanh-Nghi
    • Journal of information and communication convergence engineering
    • /
    • v.17 no.1
    • /
    • pp.14-20
    • /
    • 2019
  • Currently, microarray gene expression data support the accurate classification of cancers, which addresses problems relating to cancer causes and treatment regimens. However, the sample size of gene expression data is often restricted because the cost of microarray technology in human studies is high. We propose enhancing the gene expression classification of support vector machines with generative adversarial networks (GAN-SVMs). A GAN that generates new data from the original training datasets was implemented. The GAN was used in conjunction with nonlinear SVMs that efficiently classify gene expression data. Numerical test results on 20 low-sample-size and very high-dimensional microarray gene expression datasets from the Kent Ridge Biomedical and ArrayExpress repositories indicate that the model is more accurate than state-of-the-art classification models.
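A heavily simplified sketch of the augment-then-classify pipeline. The GAN generator is replaced here by a noise-perturbation stand-in, and the nonlinear SVM by a linear soft-margin SVM trained with hinge-loss subgradient descent; the data are synthetic. Only the overall pipeline, not the paper's model, is illustrated.

```python
import numpy as np

rng = np.random.default_rng(6)

def augment(X, y, n_new, scale=0.2):
    """Stand-in for the GAN generator: jitter real samples with Gaussian noise.
    (The paper trains a GAN; this is only a placeholder augmenter.)"""
    idx = rng.choice(len(X), n_new)
    return X[idx] + scale * rng.normal(size=(n_new, X.shape[1])), y[idx]

def linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Linear soft-margin SVM via full-batch hinge-loss subgradient descent."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        mask = y * (X @ w + b) < 1                      # margin violators
        g_w = lam * w - (y[mask, None] * X[mask]).sum(0) / len(y)
        g_b = -y[mask].sum() / len(y)
        w -= lr * g_w
        b -= lr * g_b
    return w, b

# Small "real" training set mimicking the low-sample-size regime.
d = 20
X_real = np.vstack([rng.normal(-1, 1, (10, d)), rng.normal(1, 1, (10, d))])
y_real = np.concatenate([-np.ones(10), np.ones(10)])

# Augment, then train on real + synthetic samples.
X_syn, y_syn = augment(X_real, y_real, 200)
w, b = linear_svm(np.vstack([X_real, X_syn]), np.concatenate([y_real, y_syn]))

X_test = np.vstack([rng.normal(-1, 1, (100, d)), rng.normal(1, 1, (100, d))])
y_test = np.concatenate([-np.ones(100), np.ones(100)])
acc = float((np.sign(X_test @ w + b) == y_test).mean())
```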

AN APPROACH TO THE TRAINING OF A SUPPORT VECTOR MACHINE (SVM) CLASSIFIER USING SMALL MIXED PIXELS

  • Yu, Byeong-Hyeok;Chi, Kwang-Hoon
    • Proceedings of the KSRS Conference
    • /
    • 2008.10a
    • /
    • pp.386-389
    • /
    • 2008
  • It is important that the training stage of a supervised classification be designed to provide representative spectral information. The design of the training stage typically calls for the use of a large sample of randomly selected pure pixels in order to characterize the classes. Such guidance is generally given without regard to the specific nature of the application at hand, including the classifier to be used. An approach to the training of a support vector machine (SVM) classifier that is the opposite of that generally promoted for training set design is suggested. This approach uses a small sample of mixed spectral responses drawn from purposefully selected locations (geographical boundaries) in training. Such a sample should, however, be easier and cheaper to acquire than that suggested by traditional approaches. In this research, we evaluated the approach against traditional ones with high-resolution satellite data. The results showed that a small sample of mixed pixels can be used to derive a classification with accuracy similar to that obtained using a large number of pure pixels. The approach can also substantially reduce the cost of training data acquisition because the sampling locations used are commonly easy to observe.

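A toy sketch of the training-set design idea: label a small sample of mixed (boundary) spectral responses by their dominant class, train a linear soft-margin SVM on them (hinge-loss subgradient descent as a stand-in for a full SVM solver), and evaluate on pure pixels. The spectral signatures, mixing model, and noise levels are all assumptions, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(7)
d = 4
mu_a, mu_b = -np.ones(d), np.ones(d)     # pure spectral signatures of two classes

def linear_svm(X, y, lam=0.01, epochs=300, lr=0.1):
    """Linear soft-margin SVM via full-batch hinge-loss subgradient descent."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        mask = y * (X @ w + b) < 1                      # margin violators
        g_w = lam * w - (y[mask, None] * X[mask]).sum(0) / len(y)
        g_b = -y[mask].sum() / len(y)
        w -= lr * g_w
        b -= lr * g_b
    return w, b

# Small training sample of *mixed* pixels from boundary locations:
# each is a convex mix of the two pure signatures, labelled by its dominant class.
n_mix = 30
frac = rng.uniform(0.1, 0.9, n_mix)                     # proportion of class b
X_mix = (frac[:, None] * mu_b + (1 - frac[:, None]) * mu_a
         + 0.3 * rng.normal(size=(n_mix, d)))
y_mix = np.where(frac > 0.5, 1.0, -1.0)
w, b = linear_svm(X_mix, y_mix)

# Evaluate on pure pixels drawn around each signature.
X_test = np.vstack([mu_a + rng.normal(size=(200, d)),
                    mu_b + rng.normal(size=(200, d))])
y_test = np.concatenate([-np.ones(200), np.ones(200)])
acc = float((np.sign(X_test @ w + b) == y_test).mean())
```

The mixed samples straddle the class boundary, which is where an SVM's support vectors lie, so even a small sample of them can position the separating hyperplane well.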