• Title/Summary/Keyword: Variables selection

Use of Artificial Bee Swarm Optimization (ABSO) for Feature Selection in System Diagnosis for Coronary Heart Disease

  • Wiharto;Yaumi A. Z. A. Fajri;Esti Suryani;Sigit Setyawan
    • Journal of information and communication convergence engineering, v.21 no.2, pp.130-138, 2023
  • The selection of the correct examination variables for diagnosing heart disease provides many benefits, including faster diagnosis and lower cost of examination. The selection of inspection variables can be performed by referring to the data of previous examination results so that future investigations can be carried out by referring to these selected variables. This paper proposes a model for selecting examination variables using an Artificial Bee Swarm Optimization method by considering the variables of accuracy and cost of inspection. The proposed feature selection model was evaluated using the performance parameters of accuracy, area under curve (AUC), number of variables, and inspection cost. The test results show that the proposed model can produce 24 examination variables and provide 95.16% accuracy and 97.61% AUC. These results indicate a significant decrease in the number of inspection variables and inspection costs while maintaining performance in the excellent category.

A study on the Predictors of criteria on Clothing Selection (의복선택기준 예측변인 연구)

  • Shin, Jeong-Won;Park, Eun-Joo
    • Journal of the Korean Society of Costume, v.13, pp.123-134, 1989
  • The purpose of this study was to identify the predictor variables of clothing selection criteria. Relationships among clothing selection criteria, psychological variables, lifestyle variables, and demographic variables were tested with Pearson's correlation coefficients and one-way ANOVA. The predictors of clothing selection criteria were identified by regression. Consumers were classified into several benefit segments by their clothing selection criteria, and the characteristics of each segment were then identified by multiple discriminant analysis. Data were obtained from 593 women living in Pusan through self-administered questionnaires. The results of the study were as follows. 1. Relationship between clothing selection criteria and related variables. 1) The variables important to clothing selection criteria were "down-to-earth-sophisticated", "traditional-modern", "conventional-different", "conscientious-expedient", need for exhibitionism, need for sex, and fashion/appearance. 2) The important factor among the clothing selection criteria was comfort, and it differed significantly among age groups. 3) The higher the socioeconomic status, the more appearance-oriented the selection. 2. Predictors of clothing selection criteria. There were several important predictors of clothing selection criteria, such as lifestyle, needs, and self-image. In particular, fashion/appearance among the lifestyle variables was very important. 3. Segmentation by clothing selection criteria. Four groups were classified by clothing selection criteria: a practical-oriented group, an appearance-oriented group, a practical- and appearance-oriented group, and an indifferent group. The significant discriminating variables were the fashion/appearance factor, need for exhibitionism, and need for sex. The results of this study can be used by enterprises to analyze consumers and to build clothing advertising strategies.

Variable Selection and Outlier Detection for Automated K-means Clustering

  • Kim, Sung-Soo
    • Communications for Statistical Applications and Methods, v.22 no.1, pp.55-67, 2015
  • An important problem in cluster analysis is the selection of variables that define cluster structure while eliminating noisy variables that mask it; in addition, outlier detection is a fundamental task in cluster analysis. Here we provide an automated K-means clustering process combined with variable selection and outlier identification. The automated K-means clustering procedure consists of three processes: (i) automatically calculating the number of clusters and the initial cluster centers whenever a new variable is added, (ii) identifying outliers for each cluster depending on the variables used, and (iii) selecting the variables that define cluster structure in a forward manner. To select variables, we applied the VS-KM (variable-selection heuristic for K-means clustering) procedure (Brusco and Cradit, 2001). To identify outliers, we used a hybrid approach combining a clustering-based approach and a distance-based approach. Simulation results indicate that the proposed automated K-means clustering procedure is effective at selecting variables and identifying outliers. The implemented R program can be obtained at http://www.knou.ac.kr/~sskim/SVOKmeans.r.

A Variable Selection Procedure for K-Means Clustering

  • Kim, Sung-Soo
    • The Korean Journal of Applied Statistics, v.25 no.3, pp.471-483, 2012
  • One of the most important problems in cluster analysis is the selection of variables that truly define cluster structure, while eliminating noisy variables that mask such structure. Brusco and Cradit (2001) present the VS-KM (variable-selection heuristic for K-means clustering) procedure for selecting true variables for K-means clustering based on the adjusted Rand index. This procedure starts with a fixed number of clusters in K-means and adds variables sequentially based on the adjusted Rand index. This paper presents an updated procedure combining VS-KM with the automated K-means procedure provided by Kim (2009). This automated variable selection procedure for K-means clustering calculates the number of clusters and the initial cluster centers whenever a new variable is added, and adds a variable based on the adjusted Rand index. Simulation results indicate that the proposed procedure is very effective at selecting true variables and at eliminating noisy variables. The program implemented in R can be obtained from the website "http://faculty.knou.ac.kr/sskim/nvarkm.r and vnvarkm.r".
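The adjusted Rand index that this procedure relies on measures the agreement between two partitions of the same observations, corrected for chance. The following is a minimal Python sketch of the index in its usual Hubert-Arabie form. It illustrates only the index, not the VS-KM procedure or the R programs mentioned in the abstract, and the function name is hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same observations."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same observations")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two observations are required")

    # Pairs of observations placed together in both partitions, in A, and in B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (together_both - expected) / (maximum - expected)


# Identical partitions (up to relabeling) score 1; crossed partitions score below 0.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

A value near 1 means two clusterings recover the same structure, and a value near 0 means their agreement is no better than chance.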

A Study of Users' Cognitive Characteristics Influencing upon the Usage of End-User Searching Systems (최종이용자탐색시스템의 이용과 이용자의 인지적 특성간의 관계 연구)

  • Lee Sang-Bok
    • Journal of the Korean Society for Library and Information Science, v.27, pp.291-339, 1994
  • The purpose of this study is to identify personal characteristics that affect users' cognitive characteristics regarding a system, and to verify the correlations between these cognitive characteristics and the selection of system usage in end-user searching systems (EUSS). For the empirical analysis, a preliminary model was constructed with reference to Davis' Technology Acceptance Model. The model consists of exogenous variables (personal characteristics), mediating variables (perceived usefulness, perceived ease of use), and effect variables (selection of system usage). Where exogenous variables affect mediating variables, the exogenous variables are independent variables and the mediating variables are dependent variables. Where mediating variables, having been affected by exogenous variables, are correlated with effect variables, the mediating variables are independent variables and the effect variables are dependent variables. As for the research methodology, this study regards the Academic Information System connected to the Internet as an EUSS, so questionnaires were sent to university researchers who searched the system directly. A total of 229 valid responses were analyzed with Pearson correlation analysis and stepwise selection in multiple regression, using the statistical software package 'SPSS PC+'. The findings and conclusions of this study are summarized as follows. 1. Among the personal characteristics (age, discipline, computer literacy level, perceived usefulness of use education and training, perceived satisfaction with end-user searching, perceived satisfaction with system characteristics), all characteristics except age affect perceived usefulness and perceived ease of use. Specifically, perceived satisfaction with end-user searching most affects perceived usefulness, and perceived satisfaction with system characteristics most affects perceived ease of use. 2. Perceived usefulness and perceived ease of use have a direct effect on the selection of system usage in EUSS. 3. Perceived usefulness affects the selection of system usage in EUSS more than perceived ease of use does.

A Study on the Relationship between Clothing Selection Behavior and Personal variables of Adult Women (성인여성의 의복선택행동과 관련변인연구 -자아개념을 중심으로-)

  • Kim So-Yeun;Cho Phil-Gyo
    • Journal of the Korean Society of Clothing and Textiles, v.12 no.2 s.27, pp.159-167, 1988
  • The purpose of this study was to investigate the relationship between self-concept, personal variables, and clothing selection behavior. Self-concept was measured with Choi Jung Hun's 'Perceptual Orientation Scale', and a clothing selection behavior scale was prepared for this study. The questionnaires were completed by 389 women in Taegu. Statistical analysis was performed using the F-test and Scheffé's test. The results were as follows. 1. There was a significant relationship between self-concept and clothing selection behavior (individuality, conformity, economy, modesty). 2. There were significant differences in the clothing selection behavior variables according to age. 3. There were significant differences in individuality and economy according to marital status. 4. There were significant differences in individuality, economy, and modesty according to education level. 5. There were significant differences in the clothing selection behavior variables according to monthly clothing expenses.

A Study on Split Variable Selection Using Transformation of Variables in Decision Trees

  • Chung, Sung-S.;Lee, Ki-H.;Lee, Seung-S.
    • Journal of the Korean Data and Information Science Society, v.16 no.2, pp.195-205, 2005
  • In decision tree analysis, the C4.5 and CART algorithms suffer from computational complexity and from bias in variable selection. The QUEST algorithm addresses these problems by separating variable selection from split point selection. When the input variables are continuous, QUEST uses the ANOVA F-test under the assumptions of normality and homogeneity of variances. In this paper, we investigate the influence of violating the normality assumption and the effect of transforming variables in the QUEST algorithm. In a simulation study, we obtained the empirical power and the empirical bias of variable selection after transforming variables with various types of underlying distributions.
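The F-test-based selection step for continuous inputs can be sketched as follows: compute the one-way ANOVA F statistic of each candidate variable against the class labels and choose the variable with the largest value. This is a minimal Python sketch under stated assumptions, not the published QUEST algorithm: it covers only the F statistic, it ranks by F rather than by p-value (equivalent here because every variable shares the same degrees of freedom), and the function names and toy data are hypothetical.

```python
def anova_f(values, labels):
    """One-way ANOVA F statistic of a continuous variable across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more observations than classes")

    grand_mean = sum(values) / n
    ss_between = 0.0
    ss_within = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        ss_between += len(members) * (mean - grand_mean) ** 2
        ss_within += sum((v - mean) ** 2 for v in members)

    if ss_within == 0.0:
        return float("inf")  # classes are perfectly separated with no spread
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_split_variable(columns, labels):
    """Return the name of the variable with the largest F statistic."""
    return max(columns, key=lambda name: anova_f(columns[name], labels))


# Toy data: x1 separates the two classes, x2 does not.
y = ["a", "a", "a", "b", "b", "b"]
columns = {"x1": [1, 2, 3, 11, 12, 13], "x2": [1, 12, 3, 11, 2, 13]}
chosen = select_split_variable(columns, y)
```

Because the statistic compares class means relative to within-class variance, its behavior depends on the normality and equal-variance assumptions that the abstract examines.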

Discretization Method Based on Quantiles for Variable Selection Using Mutual Information

  • Cha, Woon-Ock;Huh, Moon-Yul
    • Communications for Statistical Applications and Methods, v.12 no.3, pp.659-672, 2005
  • This paper evaluates the discretization of continuous variables for selecting relevant variables in supervised learning using mutual information. Three discretization methods are considered: MDL, Histogram, and 4-Intervals. The process of discretization and variable subset selection is evaluated by classification accuracy on six real data sets from the UCI databases. The results show that the 4-Intervals discretization method, which is based on quantiles, is robust and efficient for the variable selection process. We also visually evaluate the appropriateness of the selected subset of variables.
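The idea of ranking variables by mutual information after quantile-based discretization can be sketched as follows. This minimal Python sketch assumes that the four intervals are delimited by the sample quartiles and that mutual information is estimated from plain relative frequencies; neither detail is given in the abstract, and all function names and the toy data are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
from math import log


def quantile_bins(values, n_bins=4):
    """Map each value to one of n_bins intervals delimited by sample quantiles."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = [ordered[(k * n) // n_bins] for k in range(1, n_bins)]
    return [bisect_right(cuts, v) for v in values]  # equal values share a bin


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in nats) between two discrete lists."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in joint.items())


def rank_variables(columns, target, n_bins=4):
    """Order variable names by mutual information with the class target."""
    scores = {name: mutual_information(quantile_bins(col, n_bins), target)
              for name, col in columns.items()}
    return sorted(scores, key=lambda name: -scores[name]), scores


# Toy data: "signal" determines the class, "noise" is unrelated to it.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {"noise": [1, 3, 5, 7, 2, 4, 6, 8], "signal": [1, 2, 3, 4, 5, 6, 7, 8]}
order, scores = rank_variables(columns, target)
```

Quantile cut points depend only on the ordering of the data, so a few extreme values do not distort the bins, which is consistent with the robustness the abstract reports.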

Robust Variable Selection in Classification Tree

  • Jang Jeong Yee;Jeong Kwang Mo
    • Proceedings of the Korean Statistical Society Conference, 2001.11a, pp.89-94, 2001
  • In this study we focus on variable selection when growing decision trees. Several splitting rules and variable selection algorithms are discussed. We propose a competitive variable selection method based on the Kruskal-Wallis test, which is a nonparametric version of the ANOVA F-test. Through a Monte Carlo study we find that CART has a serious bias in variable selection towards categorical variables with many values, and that QUEST with the F-test has low power for selecting informative variables under heavy-tailed distributions.
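The rank-based test statistic behind such a method can be sketched as follows. This is a minimal Python sketch of the Kruskal-Wallis H statistic only, using average ranks for ties and omitting the tie correction factor and the p-value computation; it is not the authors' selection procedure, and the function name and toy data are hypothetical.

```python
def kruskal_wallis_h(values, labels):
    """Kruskal-Wallis H statistic (no tie correction) across class labels."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])

    # Assign 1-based ranks, giving tied values the average of their positions.
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1

    rank_sums, counts = {}, {}
    for rank, label in zip(ranks, labels):
        rank_sums[label] = rank_sums.get(label, 0.0) + rank
        counts[label] = counts.get(label, 0) + 1

    spread = sum(rank_sums[g] ** 2 / counts[g] for g in counts)
    return 12.0 / (n * (n + 1)) * spread - 3.0 * (n + 1)


# The statistic depends only on ranks, so an extreme value leaves it unchanged.
y = ["a", "a", "a", "b", "b", "b"]
h_clean = kruskal_wallis_h([1, 2, 3, 11, 12, 13], y)
h_heavy = kruskal_wallis_h([1, 2, 3, 11, 12, 1000], y)
```

Since H uses ranks rather than raw values, it is less affected by heavy tails than the F statistic, which is the motivation stated in the abstract.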

Analysis of mixture experimental data with process variables (공정변수를 갖는 혼합물 실험 자료의 분석)

  • Lim, Yong-B.
    • Journal of Korean Society for Quality Management, v.40 no.3, pp.347-358, 2012
  • Purpose: Given experimental data with mixture components and process variables, we propose a strategy for finding a proper combined model. Methods: Process variables are factors in an experiment that are not mixture components but could affect the blending properties of the mixture ingredients. For example, the effectiveness of an etching solution, measured as an etch rate, is not only a function of the proportions of the three acids that are combined to form the mixture, but also depends on the temperature of the solution and the agitation rate. Efficient designs for experiments with mixture components and process variables depend on the model relating them, which is called a combined model. The product of the canonical polynomial model for the mixture and the model for the process variables is often used as a combined model. Results: First, we choose reasonable starting models from the class of admissible product models and the practical combined models suggested by Lim (2011), based on model selection criteria. We then search for candidate models that are subset models of the starting model using a sequential variable selection method or an all-possible-regressions procedure. Conclusion: Good candidate models are screened by evaluating model selection criteria and by checking residual plots for the validity of the model assumptions. The strategy for finding a proper combined model is illustrated with examples in this paper.