• Title/Summary/Keyword: cross-validation method

GLOBAL MINIMA OF LEAST SQUARES CROSS VALIDATION FOR A SYMMETRIC POLYNOMIAL KERNEL WITH FINITE SUPPORT

  • Jung, Kang-Mo; Kim, Byung-Chun
    • Journal of applied mathematics & informatics / v.3 no.2 / pp.183-192 / 1996
  • The least squares cross-validated bandwidth is the minimizer of the cross-validation function for choosing the smoothing parameter of a kernel density estimator. It is a completely automatic method, but it requires inordinate amounts of computational time. We present a convenient formula for calculating the cross-validation function when the kernel function is a symmetric polynomial with finite support. We also suggest an algorithm for finding global minima of the cross-validation function.
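
As a concrete illustration of the least squares cross-validation function described above, the following Python sketch evaluates LSCV(h) for the Epanechnikov kernel (a symmetric polynomial kernel with finite support) and picks the minimizing bandwidth by a simple grid search. The grid search and the synthetic data are illustrative assumptions, not the global-minimization algorithm of the paper.

```python
import numpy as np

def epanechnikov(u):
    # symmetric polynomial kernel with support [-1, 1]
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def lscv(h, x):
    # LSCV(h) = integral of fhat_h^2  -  (2/n) * sum_i fhat_{h,-i}(x_i)
    n = len(x)
    grid = np.linspace(x.min() - 3 * h, x.max() + 3 * h, 2000)
    fhat = epanechnikov((grid[:, None] - x[None, :]) / h).sum(axis=1) / (n * h)
    int_f2 = np.sum(fhat**2) * (grid[1] - grid[0])          # simple quadrature
    K = epanechnikov((x[:, None] - x[None, :]) / h)
    np.fill_diagonal(K, 0.0)                                # leave-one-out term
    loo = K.sum(axis=1) / ((n - 1) * h)
    return int_f2 - 2.0 * loo.mean()

x = np.random.default_rng(0).normal(size=200)
hs = np.linspace(0.05, 1.5, 60)                             # bandwidth grid (assumed)
h_lscv = hs[np.argmin([lscv(h, x) for h in hs])]
print(h_lscv)
```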

SVM Load Forecasting using Cross-Validation (교차검증을 이용한 SVM 전력수요예측)

  • Jo, Nam-Hoon
    • The Transactions of the Korean Institute of Electrical Engineers A / v.55 no.11 / pp.485-491 / 2006
  • In this paper, we study the problem of model selection for a Support Vector Machine (SVM) predictor for short-term load forecasting. Model selection amounts to tuning the SVM parameters, such as the cost coefficient C and the kernel parameters, in order to maximize the prediction performance of the SVM. We propose that cross-validation be used as a model selection algorithm for SVM-based load forecasting. Through various experiments on several data sets, we found that the difference between the prediction error of the SVM tuned by cross-validation and that of the ideal SVM is less than 5%. This shows that SVM parameters for load forecasting can be efficiently tuned by cross-validation.
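
The model selection procedure described in this abstract amounts to a cross-validated search over SVM hyper-parameters. A minimal scikit-learn sketch follows; the dummy features, the parameter grid, and the mean-squared-error scoring are placeholder assumptions rather than the authors' exact setup.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))                               # lagged-load features (dummy)
y = X @ np.array([0.5, -0.2, 0.1, 0.3]) + 0.1 * rng.normal(size=300)

param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0], "epsilon": [0.01, 0.1]}
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5,
                      scoring="neg_mean_squared_error")     # 5-fold cross-validation
search.fit(X, y)
print(search.best_params_)                                  # CV-selected C, gamma, epsilon
```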

Robust Cross Validation Score

  • Park, Dong-Ryeon
    • Communications for Statistical Applications and Methods / v.12 no.2 / pp.413-423 / 2005
  • Consider the problem of estimating the underlying regression function from a set of noisy data contaminated by a long-tailed error distribution. Several robust smoothing techniques exist, and they turn out to be very useful for reducing the influence of outlying observations. However, no matter what kind of robust smoother we use, we must still choose the smoothing parameter, and relatively little attention has been paid to robust bandwidth selection methods. In this paper, we adopt the idea of robust location parameter estimation and propose robust cross-validation score functions.
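
One way to instantiate the idea of a robust cross-validation score is to replace the mean of leave-one-out residual losses by a robust location estimate such as the median. The sketch below does this for a Nadaraya-Watson smoother with heavy-tailed noise; it illustrates the general principle only, not the specific score functions proposed in the paper.

```python
import numpy as np

def nw_loo_residuals(h, x, y):
    # leave-one-out residuals of a Nadaraya-Watson smoother with Gaussian weights
    W = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    np.fill_diagonal(W, 0.0)                                # drop the point itself
    return y - (W @ y) / W.sum(axis=1)

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0.0, 1.0, 150))
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_t(df=2, size=150)   # heavy-tailed errors

hs = np.linspace(0.02, 0.3, 40)
robust_score = [np.median(np.abs(nw_loo_residuals(h, x, y))) for h in hs]
h_robust = hs[np.argmin(robust_score)]                      # robust CV bandwidth
print(h_robust)
```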

Validation Technique using variance and confidence interval of metamodel (근사모델의 분산과 신뢰구간을 이용한 모델의 정확도 평가법)

  • Han, In-Sik; Lee, Yong-Bin; Choi, Dong-Hoon
    • Proceedings of the KSME Conference / 2008.11a / pp.1169-1175 / 2008
  • Validation techniques can be classified into two groups according to whether they demand additional experimental points. Methods that require additional experimental points, such as RMSE, are often impractical in engineering fields. Therefore, methods that use only the existing experimental points, such as cross-validation, are the only ones available. However, cross-validation not only requires considerable computational cost, since the metamodel must be regenerated at each iteration, but also cannot quantitatively measure the fidelity of the metamodel. In this research we propose a new validation technique for representative metamodels that uses the variance of the metamodel and confidence interval information. The proposed validation technique computes confidence intervals from the variance information of the metamodel. This technique will be useful for choosing an accurate metamodel, constructing ensembles of metamodels, and improving sequential sampling techniques.
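
A rough sketch of the variance-and-confidence-interval idea: a kriging (Gaussian process) metamodel supplies a predictive standard deviation at untried points, from which approximate confidence bands can be formed without any additional experiments. The interval level (95%) and the average band width used as a fidelity proxy are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_simulation(x):                                # stand-in for the true response
    return np.sin(3.0 * x).ravel()

X_train = np.linspace(0.0, 2.0, 8).reshape(-1, 1)           # existing design points only
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-8)
gp.fit(X_train, expensive_simulation(X_train))

X_query = np.linspace(0.0, 2.0, 200).reshape(-1, 1)
mean, std = gp.predict(X_query, return_std=True)            # metamodel variance information
lower, upper = mean - 1.96 * std, mean + 1.96 * std         # approximate 95% band
print(np.mean(upper - lower))                               # one possible fidelity proxy
```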

A Study on Accuracy Estimation of Service Model by Cross-validation and Pattern Matching

  • Cho, Seongsoo; Shrestha, Bhanu
    • International journal of advanced smart convergence / v.6 no.3 / pp.17-21 / 2017
  • In this paper, service execution accuracy is compared between an ontology-based rule inference method and a machine learning method, and we determine the amount of data at which the service execution accuracy of the machine learning method becomes equal to that of the rule inference. The rule inference measures service execution accuracy using accumulated data and pattern matching on service results, while the machine learning method measures service execution accuracy using cross-validation data. After creating a confusion matrix and measuring the accuracy of each service execution, the inference algorithm can be selected from the results.
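
The machine-learning side of this comparison, measuring service execution accuracy from a confusion matrix built on cross-validated predictions, can be sketched as follows; the classifier and the dummy service data are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix, accuracy_score

rng = np.random.default_rng(4)
X = rng.normal(size=(400, 6))                               # context features (dummy)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)               # executed service label (dummy)

pred = cross_val_predict(RandomForestClassifier(random_state=0), X, y, cv=5)
print(confusion_matrix(y, pred))                            # confusion matrix from CV folds
print(accuracy_score(y, pred))                              # service execution accuracy
```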

Estimation of daily maximum air temperature using NOAA/AVHRR data (NOAA/AVHRR 자료를 이용한 일 최고기온 추정에 관한 연구)

  • 변민정; 한영호; 김영섭
    • Proceedings of the Korean Association of Geographic Information Studies Conference / 2003.04a / pp.291-296 / 2003
  • This study estimated surface temperature from NOAA/AVHRR data using the split-window technique. For surface monitoring, cloud masking was carried out using a threshold algorithm. The daily maximum air temperature is estimated by a multiple regression method using independent variables such as the satellite-derived surface temperature, EDD, and latitude. The highest correlation was obtained when the EDD data were added, which indicates that EDD is a necessary element for estimating the daily maximum air temperature. We derived correlations and empirical equations for estimating the daily maximum air temperature using three approaches: 1) ignoring land cover within each season, 2) considering land cover within each season, and 3) considering land cover only. The last approach shows the highest correlation, so the cross-validation procedure was applied to the third approach to validate the estimated values. For all 5 land cover types, the cross-validation results show reasonable agreement with the measured values (slope = 0.97, intercept = -0.30, R² = 0.84, RMSE = 4.24°C). Likewise, for all 7 land cover types, the cross-validation results show reasonable agreement with the measured values (slope = 0.993, intercept = 0.062, R² = 0.84, RMSE = 4.43°C).
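
The cross-validation step described above can be reproduced in outline as leave-one-out CV on a multiple linear regression, followed by the usual agreement statistics (slope, intercept, R², RMSE) between predicted and observed maxima. The synthetic predictors below (surface temperature, EDD, latitude) and their coefficients are placeholders, not the study's data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict, LeaveOneOut
from sklearn.metrics import r2_score

rng = np.random.default_rng(5)
X = np.column_stack([rng.uniform(10, 40, 120),              # satellite surface temperature (degC)
                     rng.uniform(0, 30, 120),               # EDD
                     rng.uniform(33, 39, 120)])             # latitude (deg N)
t_max = 0.8 * X[:, 0] + 0.2 * X[:, 1] - 0.3 * X[:, 2] + rng.normal(0, 3, 120)

pred = cross_val_predict(LinearRegression(), X, t_max, cv=LeaveOneOut())
slope, intercept = np.polyfit(t_max, pred, 1)               # agreement line: predicted vs observed
rmse = np.sqrt(np.mean((pred - t_max) ** 2))
print(slope, intercept, r2_score(t_max, pred), rmse)
```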

LS-SVM for large data sets

  • Park, Hongrak; Hwang, Hyungtae; Kim, Byungju
    • Journal of the Korean Data and Information Science Society / v.27 no.2 / pp.549-557 / 2016
  • In this paper we propose a multiclass classification method for large data sets that ensembles least squares support vector machines (LS-SVM) built on principal components instead of the raw input vector. We use a revised one-vs-all method for multiclass classification, a voting scheme based on combining several binary classifications. The revised one-vs-all method uses the hat matrix of the LS-SVM ensemble, which is obtained by ensembling LS-SVMs trained on random samples drawn from the whole large training data set. The leave-one-out cross-validation (CV) function is used to obtain optimal values of the hyper-parameters that affect the performance of the multiclass LS-SVM ensemble. We present a generalized cross-validation function to reduce the computational burden of the leave-one-out CV function. Experimental results on real data sets illustrate the performance of the proposed multiclass LS-SVM ensemble.
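
The generalized cross-validation shortcut mentioned in this abstract avoids refitting the model n times: with the hat matrix H of a least-squares kernel model, GCV(lambda) = n * ||(I - H) y||^2 / (trace(I - H))^2. The sketch below uses kernel ridge regression as a stand-in for LS-SVM; the kernel and data are assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def gcv(lmbda, K, y):
    # GCV(lambda) = n * ||(I - H) y||^2 / (trace(I - H))^2,  H = K (K + lambda I)^{-1}
    n = len(y)
    H = K @ np.linalg.inv(K + lmbda * np.eye(n))
    resid = y - H @ y
    return n * (resid @ resid) / (n - np.trace(H)) ** 2

rng = np.random.default_rng(6)
X = rng.normal(size=(150, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=150)

K = rbf_kernel(X)
lambdas = np.logspace(-4, 2, 30)
lam_best = lambdas[np.argmin([gcv(l, K, y) for l in lambdas])]
print(lam_best)
```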

PRECONDITIONED GL-CGLS METHOD USING REGULARIZATION PARAMETERS CHOSEN FROM THE GLOBAL GENERALIZED CROSS VALIDATION

  • Oh, SeYoung; Kwon, SunJoo
    • Journal of the Chungcheong Mathematical Society / v.27 no.4 / pp.675-688 / 2014
  • In this paper, we present an efficient way to determine a suitable value of the regularization parameter using global generalized cross-validation, and we analyze experimental results from the preconditioned global conjugate gradient linear least squares (Gl-CGLS) method on image deblurring problems. Preconditioned Gl-CGLS solves general linear systems with multiple right-hand sides; it has been shown in [10] that this method can be effectively applied to image deblurring problems. The regularization parameter chosen by global generalized cross-validation, combined with the preconditioned Gl-CGLS method, gives better reconstructions of the true image than the other parameters considered in this study.
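
As background for this parameter-choice step, the following sketch computes an ordinary (single right-hand-side) GCV curve for Tikhonov regularization via the SVD filter factors and picks its minimizer. The small random ill-conditioned system stands in for the blurring operator, and the global, multiple right-hand-side variant used in the paper is analogous.

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(80, 80)) @ np.diag(0.9 ** np.arange(80))   # ill-conditioned stand-in
x_true = np.sin(np.linspace(0.0, 4.0, 80))
b = A @ x_true + 1e-3 * rng.normal(size=80)

U, s, _ = np.linalg.svd(A)
beta = U.T @ b

def gcv(lam):
    f = s**2 / (s**2 + lam**2)                              # Tikhonov filter factors
    return np.sum(((1.0 - f) * beta) ** 2) / (len(b) - f.sum()) ** 2

lams = np.logspace(-6, 0, 60)
lam_gcv = lams[np.argmin([gcv(l) for l in lams])]           # GCV-chosen regularization parameter
print(lam_gcv)
```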

On Practical Choice of Smoothing Parameter in Nonparametric Classification (베이즈 리스크를 이용한 커널형 분류에서 평활모수의 선택)

  • Kim, Rae-Sang; Kang, Kee-Hoon
    • Communications for Statistical Applications and Methods / v.15 no.2 / pp.283-292 / 2008
  • The smoothing parameter, or bandwidth, plays a key role in nonparametric classification based on kernel density estimation. We consider choosing the smoothing parameter in nonparametric classification so as to optimize the Bayes risk. Hall and Kang (2005) clarified the theoretical properties of the smoothing parameter in terms of minimizing the Bayes risk and derived its optimal order, using a bootstrap method to explore its numerical properties. We compare the cross-validation and bootstrap methods numerically in terms of the optimal order of the bandwidth, and also examine the effects on the misclassification rate. We confirm that the bootstrap method is superior to cross-validation in both respects.
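
A minimal sketch of the cross-validation side of this comparison: the bandwidth of a kernel-density classifier is chosen to minimize the cross-validated misclassification rate, an empirical surrogate for the Bayes risk. The two-class data, common bandwidth, and fold count are illustrative assumptions; the bootstrap alternative resamples the data instead of splitting it.

```python
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(8)
X = np.concatenate([rng.normal(-1.0, 1.0, 200), rng.normal(1.5, 1.0, 200)])[:, None]
y = np.array([0] * 200 + [1] * 200)

def cv_error(h, X, y, folds=5):
    # cross-validated misclassification rate of a kernel-density (plug-in Bayes) classifier
    err = 0.0
    for tr, te in StratifiedKFold(folds, shuffle=True, random_state=0).split(X, y):
        log_post = []
        for c in (0, 1):
            kde = KernelDensity(bandwidth=h).fit(X[tr][y[tr] == c])
            log_post.append(np.log(np.mean(y[tr] == c)) + kde.score_samples(X[te]))
        err += np.mean(np.argmax(log_post, axis=0) != y[te])
    return err / folds

hs = np.linspace(0.05, 1.0, 20)
h_cv = hs[np.argmin([cv_error(h, X, y) for h in hs])]
print(h_cv)
```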

Kernel method for autoregressive data

  • Shim, Joo-Yong; Lee, Jang-Taek
    • Journal of the Korean Data and Information Science Society / v.20 no.5 / pp.949-954 / 2009
  • The autoregressive process is applied in this paper to kernel regression in order to infer nonlinear models for predicting responses. We propose a kernel method for autoregressive data that estimates the mean function by kernel machines. We also present a model selection method that employs cross-validation techniques for choosing the hyper-parameters that affect the performance of kernel regression. Artificial and real examples are provided to indicate the usefulness of the proposed method for estimating the mean function in the presence of autocorrelation between data.
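
The overall workflow can be sketched as follows: lagged values of an autoregressive series serve as inputs to a kernel machine (kernel ridge regression here, as a stand-in for the authors' kernel method), and its hyper-parameters are chosen by cross-validation with time-ordered splits so that the autocorrelation is respected. The AR(1) example and the parameter grid are assumptions.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(9)
z = np.zeros(400)
for t in range(1, 400):                                     # nonlinear AR(1) series (dummy)
    z[t] = 0.7 * z[t - 1] + 0.3 * np.sin(z[t - 1]) + 0.2 * rng.normal()

X = np.column_stack([z[1:-1], z[0:-2]])                     # lag-1 and lag-2 inputs
y = z[2:]

param_grid = {"alpha": [0.01, 0.1, 1.0], "gamma": [0.1, 1.0, 10.0]}
search = GridSearchCV(KernelRidge(kernel="rbf"), param_grid,
                      cv=TimeSeriesSplit(n_splits=5),       # time-ordered CV splits
                      scoring="neg_mean_squared_error")
search.fit(X, y)
print(search.best_params_)                                  # CV-selected hyper-parameters
```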
