• Title/Summary/Keyword: Variables selection

Variable selection for multiclassification by LS-SVM

  • Hwang, Hyung-Tae
    • Journal of the Korean Data and Information Science Society / v.21 no.5 / pp.959-965 / 2010
  • In multiclassification, it is often the case that some variables are unimportant while others matter more. We propose a novel algorithm for selecting the relevant variables for multiclassification. The algorithm is based on the multiclass least squares support vector machine (LS-SVM) and uses the results of a multiclass LS-SVM trained with the one-vs-all method. Experimental results are then presented that indicate the performance of the proposed method.

Fishery Products Processed Food Research for Reference of Selection and Pursuable Benefit of Fishery Products Processed Food (소비자의 추구혜택에 따른 수산물 가공식품의 선택속성에 관한 연구)

  • Kim, Sung-Jong;Ha, Kyu-Soo
    • 한국벤처창업학회:학술대회논문집 / 2010.08a / pp.93-112 / 2010
  • Consumers show higher interest in fishery products processed food that is good for personal health and offers convenience, nourishment, and taste, but current domestic research on fishery products processed food is marginal. This research therefore systematically analyzes consumers' consumption patterns and their relationship to consumers' pursuable benefit, reference for selection, satisfaction level, and purpose of purchase. The results are as follows. Consumers consider product information the most important reference for selection, and convenience the highest pursuable benefit. The research also analyzes the influence of reference for selection and pursuable benefit on satisfaction level and purpose of purchase, using demographic properties as control variables. The variables that affect satisfaction level are residential district (region), recipe, nutrient, convenience, and economy; the variables that affect purpose of purchase are nutrient, convenience, and satisfaction level. If these results are used to develop new products and to industrialize fishery products processed food, the consumer market for it can be expanded. The results can also serve as a fundamental reference for sales promotion.

Design of Space Search-Optimized Polynomial Neural Networks with the Aid of Ranking Selection and L2-norm Regularization

  • Wang, Dan;Oh, Sung-Kwun;Kim, Eun-Hu
    • Journal of Electrical Engineering and Technology / v.13 no.4 / pp.1724-1731 / 2018
  • The conventional polynomial neural network (PNN) is a classical, flexible, self-organizing neural structure; however, it is prone to overfitting. In this study, we propose a space search-optimized polynomial neural network (ssPNN) structure to alleviate this problem. Ranking selection is realized by means of a ranking selection-based performance index (RS_PI), which combines the conventional performance index (PI) with a coefficient-based performance index (CPI), viz. the sum of squared coefficients. Unlike the conventional PNN, the ssPNN also uses L2-norm regularization when estimating the polynomial coefficients. Furthermore, space search optimization (SSO) is used to optimize the parameters of the ssPNN, viz. the number of input variables, which variables are selected as inputs, and the type of polynomial. Experimental results show that the proposed ranking selection-based polynomial neural network performs better than the neuro-fuzzy models reported in the literature.

On an Optimal Bayesian Variable Selection Method for Generalized Logit Model

  • Kim, Hea-Jung;Lee, Ae Kuoung
    • Communications for Statistical Applications and Methods / v.7 no.2 / pp.617-631 / 2000
  • This paper proposes a Bayesian method for variable selection in the generalized logit model. It is based on the Laplace-Metropolis algorithm, which provides a simple way to estimate the marginal likelihood of the model. The algorithm leads to a criterion for selecting variables: find the subset of variables that maximizes the marginal likelihood of the model. This criterion is shown to be a Bayes rule in the sense that it minimizes the risk of variable selection under the 0-1 loss function. The suggested method is illustrated with two examples and compared with existing frequentist methods.
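
The criterion in this abstract can be written out as follows. This is a sketch in assumed notation, not the paper's own: $\gamma$ indexes a candidate subset of variables, $M_\gamma$ is the corresponding model with $d_\gamma$ parameters, and $\hat{\theta}_\gamma$ and $\hat{\Sigma}_\gamma$ are the posterior mode and posterior covariance, which the Laplace-Metropolis approach estimates from Metropolis output. The second line additionally assumes equal prior probabilities for all candidate models, so that the posterior-probability maximizer under 0-1 loss coincides with the marginal-likelihood maximizer.

```latex
% Laplace approximation to the marginal likelihood of model M_\gamma
m(y \mid M_\gamma) \;\approx\; (2\pi)^{d_\gamma/2}\,
  \lvert \hat{\Sigma}_\gamma \rvert^{1/2}\,
  f(y \mid \hat{\theta}_\gamma, M_\gamma)\,
  \pi(\hat{\theta}_\gamma \mid M_\gamma)

% Selected subset (Bayes rule under 0-1 loss, equal prior model probabilities)
\hat{\gamma} \;=\; \arg\max_{\gamma}\; m(y \mid M_\gamma)
```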

Variable selection in Poisson HGLMs using h-likelihood

  • Ha, Il Do;Cho, Geon-Ho
    • Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1513-1521 / 2015
  • Selecting relevant variables for a statistical model is very important in regression analysis. Recently, variable selection methods using a penalized likelihood have been widely studied in various regression models. The main advantage of these methods is that they select important variables and estimate the regression coefficients of the covariates simultaneously. In this paper, we propose a simple procedure based on a penalized h-likelihood (HL) for variable selection in Poisson hierarchical generalized linear models (HGLMs) for correlated count data. To this end, we consider three penalty functions (LASSO, SCAD, and HL) and derive the corresponding variable-selection procedures. The proposed method is illustrated using a practical example.
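
To make the idea of penalty functions concrete, here is a minimal sketch of two of the three penalties named in the abstract, LASSO and SCAD, in their commonly used textbook forms. The function names, the default `a=3.7`, and the toy numbers are assumptions for illustration; the HL penalty and the Poisson HGLM fitting procedure from the paper are not reproduced. The `soft_threshold` helper shows, for the simplest one-coefficient case, why a LASSO-type penalty selects and estimates at the same time: small coefficients are set exactly to zero while large ones are shrunk.

```python
def lasso_penalty(beta, lam):
    """LASSO penalty: lam * |beta|, which grows without bound."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty in its usual piecewise form (a > 2; 3.7 is a common default).

    It matches LASSO near zero, then bends over and becomes constant,
    so large coefficients are not penalized more and more heavily.
    """
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def soft_threshold(z, lam):
    """Minimizer of 0.5 * (z - beta)**2 + lam * |beta| over beta."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized estimates for four covariates.
raw_estimates = [2.0, 0.3, -1.2, -0.1]
estimates = [soft_threshold(z, 0.5) for z in raw_estimates]
selected = [i for i, b in enumerate(estimates) if b != 0.0]
print(estimates, selected)
```

With these toy numbers the second and fourth coefficients are set exactly to zero, so only the first and third covariates remain in the model.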

A New Calibration Method Based on the Recursive Linear Regression with Variables Selection

  • Park, Kwang-Su;Jun, Chi-Hyuck
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference / 2001.06a / pp.1241-1241 / 2001
  • We propose a new calibration method that linearizes the spectral responses and repeatedly applies the linearization weight matrices to construct a feature. The weight matrices are estimated through multiple linear regression (or principal component regression, or partial least squares) with forward variable selection. The proposed method is applied to three data sets: FTIR spectral data for FeO content from a sinter process, NIR spectra from a trans-alkylation process with two constituent variables, and NIR spectra of crude oil with three physical property variables. To assess calibration performance, we compare the new method with PLS. The new method performs slightly better than PLS, and the calibration result is stable despite the collinearity among the selected spectral responses. Furthermore, the repeated application of the linearization matrices causes uninformative variables to be disregarded; that is, the new method simultaneously has the effect of variable subset selection.
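
The forward variable selection step mentioned in this abstract can be sketched generically as follows. This is not the authors' recursive calibration method: it only shows plain forward selection for multiple linear regression, adding at each step the variable that most reduces the residual sum of squares (RSS). The helper names, the stopping threshold `min_gain`, and the toy data are all assumptions, and the small pure-Python solver stands in for a numerical library.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(rows, y, cols):
    """RSS of a least-squares fit of y on an intercept plus the given columns."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, d))) ** 2 for d, t in zip(design, y))


def forward_select(rows, y, max_vars, min_gain=1e-6):
    """Greedily add the column that lowers RSS the most; stop when the gain is negligible."""
    selected = []
    remaining = list(range(len(rows[0])))
    best_rss = rss(rows, y, selected)
    while remaining and len(selected) < max_vars:
        candidate = min(remaining, key=lambda c: rss(rows, y, selected + [c]))
        new_rss = rss(rows, y, selected + [candidate])
        if best_rss - new_rss < min_gain:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_rss = new_rss
    return selected, best_rss


# Hypothetical data: four candidate variables, response depends only on columns 0 and 2.
rows = [[i, (i * i) % 7, (3 * i) % 5, i % 2] for i in range(12)]
y = [2 * r[0] - 3 * r[2] + 1 for r in rows]
selected, final_rss = forward_select(rows, y, max_vars=3)
print(selected, final_rss)
```

On this toy data the procedure picks columns 0 and 2 and then stops, because adding a third variable no longer reduces the RSS by more than `min_gain`.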

Predicting the Number of Movie Audiences Through Variable Selection Based on Information Gain Measure (정보 소득율 기반의 변수 선택을 통한 영화 관객 수 예측)

  • Park, Hyeon-Mock;Choi, Sang Hyun
    • Journal of Information Technology Applications and Management / v.26 no.3 / pp.19-27 / 2019
  • In this study, we propose a methodology for predicting movie audiences from movie information that can be easily acquired before opening, and for effectively distinguishing qualitative variables. In addition, we construct a model that estimates the number of movie audiences at the time of data acquisition from the configured variables. Another purpose of this study is to provide a criterion for categorizing the success of movies with qualitative characteristics. As the evaluation criterion, we use the information gain ratio, which is the node selection criterion of the C4.5 algorithm. Through this procedure we selected 416 movie data features. In the multiple linear regression results, the regression model built with variable selection based on the information gain ratio performed excellently.
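
The information gain ratio used as the selection criterion can be computed as in the sketch below: the information gain of a categorical variable divided by the entropy of that variable's own value distribution (its split information), as in C4.5. The function names and the toy movie attributes are hypothetical and are not taken from the paper's data.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the split information."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: two qualitative attributes and a success label.
genre = ["action", "action", "drama", "drama", "comedy", "comedy"]
season = ["summer", "winter", "summer", "winter", "summer", "winter"]
success = ["hit", "hit", "flop", "flop", "hit", "flop"]

scores = {"genre": gain_ratio(genre, success), "season": gain_ratio(season, success)}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Variables would then be kept in order of their score. How many to keep, or what cutoff to apply, is a separate choice that this sketch does not make.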

An Empirical Investigation into the Factors Influencing Shopping Mall Selection Decisions in the Cyber Shopping Environment (사이버 쇼핑 환경에서 소비자의 쇼핑몰 선택에 영향을 미치는 요인에 관한 연구)

  • Kim, Jong-Uk
    • The Journal of Information Systems / v.14 no.1 / pp.171-195 / 2005
  • The current study investigates the major factors that influence consumers' selection of internet shopping malls. Based on the technology acceptance model (TAM) (Davis, 1989), trust theory (McKnight & Chervany, 2001), and consumer selection factors from marketing research (Burke, 1997; Dodds et al., 2001), perceived usefulness, perceived ease of use, trust, service quality, and product price were hypothesized to affect the consumer's decision to choose a specific internet shopping mall. The study developed a research model to explain shopping mall selection and collected survey responses from 312 internet shopping mall users. The results indicate that all of the research variables employed in the study (perceived usefulness, perceived ease of use, trust, service quality, and product price) significantly influence the consumer's shopping mall selection decision. Among the influencing factors, price, service quality, and trust showed a greater effect on shopping mall selection than usefulness and ease of use. This result implies that purchase-related variables such as price and service quality may be more critical to attracting customers, and thereby raising the sales volume of the shopping mall, than the web site's usefulness and ease of use.

A Method for Screening Product Design Variables for Building A Usability Model : Genetic Algorithm Approach (사용편의성 모델수립을 위한 제품 설계 변수의 선별방법 : 유전자 알고리즘 접근방법)

  • Yang, Hui-Cheol;Han, Seong-Ho
    • Journal of the Ergonomics Society of Korea / v.20 no.1 / pp.45-62 / 2001
  • This study suggests a genetic algorithm-based partial least squares (GA-based PLS) method for selecting the design variables used to build a usability model. The GA-based PLS uses a genetic algorithm to minimize the root-mean-squared error of a partial least squares regression model. A multiple linear regression method is then applied to build a usability model that contains the variables selected by the GA-based PLS. The resulting usability model generally performed better than previous usability models built with other variable selection methods such as expert rating, principal component analysis, cluster analysis, and partial least squares. Furthermore, model performance improved drastically when the category-type variables selected by the GA-based PLS were added to the usability model. It is recommended that the GA-based PLS be applied to variable selection when developing a usability model.

Application of Variable Selection for Prediction of Target Concentration

  • 김선우;김연주;김종원;윤길원
    • Bulletin of the Korean Chemical Society / v.20 no.5 / pp.525-527 / 1999
  • Many types of chemical data are characterized by many measured variables on each of a few observations. In this situation, target concentration can be predicted using multivariate statistical modeling. However, considerations of instrument size and cost, for example in developing a portable biomedical instrument, make it necessary to use only a few variables. Using a spectral data set of total hemoglobin in whole blood, this study demonstrates the possibility that a model using only a few variables can improve predictability compared with a model using all of the variables. The model using three wavelengths selected by the all-possible-regressions method predicted better than the model using whole spectra (whole spectra: SEP = 0.4 g/dL; three wavelengths: SEP = 0.3 g/dL). It appears that proper selection of variables can be more effective than using whole spectra for determining the hemoglobin concentration in whole blood.
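
The all-possible-regressions idea can be sketched as an exhaustive search: fit a least-squares model for every subset of a given size and keep the subset with the lowest error. The sketch below is an illustration under stated assumptions, not the study's procedure. It ranks subsets by the residual sum of squares (RSS) on the fitting data, whereas the SEP reported in the abstract would be computed on separate prediction samples. The function names, the four-column toy "absorbance" table, and the subset size `k=2` are all hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def rss(rows, y, cols):
    """RSS of a least-squares fit of y on an intercept plus the given columns."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, d))) ** 2 for d, t in zip(design, y))


def best_subset(rows, y, k):
    """Try every k-column subset and return the one with the lowest RSS."""
    n_cols = len(rows[0])
    return min(combinations(range(n_cols), k), key=lambda cols: rss(rows, y, cols))


# Hypothetical readings at four wavelengths for eight samples.
absorbance = [
    [1, 2, 1, 5],
    [2, 1, 4, 3],
    [3, 4, 2, 8],
    [4, 3, 8, 1],
    [5, 6, 5, 7],
    [6, 5, 7, 2],
    [7, 8, 3, 6],
    [8, 7, 6, 4],
]
# Toy target built from columns 1 and 2 only, so the search has a known answer.
concentration = [row[1] + row[2] for row in absorbance]

best = best_subset(absorbance, concentration, k=2)
print(best, rss(absorbance, concentration, best))
```

The number of subsets grows combinatorially with the number of wavelengths, which is why exhaustive search is practical only for small subset sizes such as the three wavelengths used in the study.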