• Title/Abstract/Keywords: Misclassification Rate

Search results: 67 items, processing time 0.029 s

Modifying linearly non-separable support vector machine binary classifier to account for the centroid mean vector

  • Mubarak Al-Shukeili;Ronald Wesonga
    • Communications for Statistical Applications and Methods / Vol. 30, No. 3 / pp. 245-258 / 2023
  • This study proposes a modification to the objective function of the support vector machine for the linearly non-separable case of a binary classifier with labels y_i ∈ {-1, 1}. The modification takes into account the position of each data item x_i relative to its class centroid. The resulting optimization function involves the centroid mean vector and the spread of the data in addition to the support vectors, and it is minimized over the choice of hyperplane β. Theoretical assumptions were tested to derive an optimal separating hyperplane that yields the minimal misclassification rate. The proposed method was evaluated using simulation studies and real-life COVID-19 patient hospitalization outcome data. The results show that the proposed method outperforms the classical linear SVM classifier as the sample size increases, and that it is preferred when predictors are correlated and when extreme values are present.
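For reference, the classical soft-margin linear SVM that the abstract uses as its baseline, together with the misclassification rate used for comparison, can be written as below. This is the standard textbook formulation, not the paper's modified objective (the abstract does not state the centroid term). The symbols β_0, ξ_i, C and ŷ_i are notation assumed here.

```latex
\min_{\beta,\,\beta_0,\,\xi}\; \frac{1}{2}\lVert\beta\rVert^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\,(x_i^{\top}\beta + \beta_0) \ge 1 - \xi_i,\;\; \xi_i \ge 0,\;\; i = 1,\dots,n

\widehat{\mathrm{MR}} = \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\{\hat{y}_i \ne y_i\},
\qquad
\hat{y}_i = \operatorname{sign}(x_i^{\top}\beta + \beta_0)
```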

선형판별분석에서 MCMC다중대체법의 효율에 관한 연구 (A Study on the Efficiency of the MCMC Multiple Imputation in LDA)

  • 유희경;김명철
    • 대한안전경영과학회지 / Vol. 11, No. 3 / pp. 189-198 / 2009
  • This thesis studies two imputation methods for handling missing values, the MCMC method and the EM algorithm. Their performance in linear (or quadratic) discriminant analysis is evaluated under various types of incomplete observations. Based on simulation experiments, the effects of imputation using the EM algorithm and the MCMC method are evaluated and compared in terms of the probability of misclassification and the RMSE. The cases of incomplete observations are differentiated by missing rate, sample size, and distance between the two classification groups. The results show that the probability of misclassification and the RMSE are lower for the EM algorithm than for the MCMC method, so imputation using the EM algorithm is more efficient. However, when the sample size is small and the missing rate is extremely high, omitting all observation vectors with missing values from the analysis yields a lower probability of misclassification than either the EM algorithm or the MCMC method.
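As a point of reference for how the "probability of misclassification" relates to the "distance between the two groups": under assumptions that the abstract does not state (two multivariate normal populations with a common covariance matrix Σ, equal prior probabilities, and known parameters), the optimal linear discriminant rule has the error probability below. Here Φ is the standard normal distribution function and Δ is the Mahalanobis distance between the group means μ_1 and μ_2, so a larger distance gives a lower error probability.

```latex
P(\text{misclassification}) = \Phi\!\left(-\frac{\Delta}{2}\right),
\qquad
\Delta^2 = (\mu_1 - \mu_2)^{\top}\,\Sigma^{-1}\,(\mu_1 - \mu_2)
```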

딥뉴럴네트워크 상에 신속한 오인식 샘플 생성 공격 (Rapid Misclassification Sample Generation Attack on Deep Neural Network)

  • 권현;박상준;김용철
    • 융합보안논문지 / Vol. 20, No. 2 / pp. 111-121 / 2020
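  • Deep neural networks perform well in machine learning tasks such as image recognition and object recognition. However, deep neural networks are vulnerable to adversarial examples. An adversarial example is a sample created by adding minimal noise to an original sample so that the deep neural network misclassifies it. A drawback of such adversarial examples is that it takes a long time to generate a sample that both keeps the noise relative to the original sample minimal and causes the deep neural network to misclassify. In some cases, therefore, an attack may be needed that quickly causes the deep neural network to misclassify even if the noise is not minimal. In this paper, we propose a rapid misclassification sample generation attack that prioritizes attacking the deep neural network quickly. The proposed method adds noise focused on causing the deep neural network to misclassify, without considering the distortion of the original sample. Because it does not separately account for distortion of the original sample, unlike existing methods, it generates samples faster than existing methods. MNIST and CIFAR10 were used as experimental datasets, and Tensorflow was used as the machine learning library. In the experiments, the proposed misclassification samples achieved a 100% attack success rate with 50% and 80% fewer iterations than the existing method on MNIST and CIFAR10, respectively.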
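The idea of trading distortion for speed can be illustrated with a minimal sketch. It is not the authors' method (the abstract does not give their loss or update rule): it assumes a toy linear classifier with hypothetical weights `W` and bias `b`, and uses a generic sign-gradient step with no distortion penalty, stopping as soon as the prediction changes. The names `predict` and `rapid_misclassify` are illustrative.

```python
import numpy as np

def predict(W, b, x):
    """Return the predicted class index of a toy linear classifier."""
    return int(np.argmax(W @ x + b))

def rapid_misclassify(W, b, x, step=0.1, max_iter=1000):
    """Add noise until the toy classifier's prediction changes.

    No distortion penalty is applied: each step only pushes the score of
    the strongest competing class above that of the original class.
    Returns the perturbed sample and the number of iterations used.
    """
    original = predict(W, b, x)
    x_adv = np.asarray(x, dtype=float).copy()
    for i in range(1, max_iter + 1):
        scores = W @ x_adv + b
        scores[original] = -np.inf
        target = int(np.argmax(scores))  # strongest competing class
        # Gradient of (score_target - score_original) w.r.t. x for a linear model.
        grad = W[target] - W[original]
        x_adv += step * np.sign(grad)
        if predict(W, b, x_adv) != original:
            return x_adv, i
    return x_adv, max_iter
```

A distortion-aware attack would add a penalty on the size of `x_adv - x` to the objective. Dropping that term is what makes each iteration cheaper in this sketch.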

회전분리망 흡착선별기의 순환 굵은골재 이물질 제거효율에 관한 연구 (A Study on Aggregate Waste Separation Efficiency Using Adsorption System with Rotating Separation Net)

  • 조성광;김규용;김경욱;선상원;박진영
    • 한국건설순환자원학회논문집 / Vol. 9, No. 1 / pp. 85-91 / 2021
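  • To recover the foreign matter that arises during the sorting of recycled aggregate from construction waste before the aggregate is shipped, an adsorption separator with a rotating separation net was designed and built. To evaluate the performance of the separator, standardized foreign-matter samples were made from acrylic according to the types of foreign matter recovered from the recycled aggregate itself. A suitable operating point was then assessed by evaluating the separation efficiency and the misclassification rate of recycled aggregate as functions of the control frequency of the suction fan and the position of the suction inlet of the separation net. The separation efficiency at the operating point predicted by a flow analysis, in which recycled aggregate and foreign matter were modeled as particles, was also evaluated. The performance tests showed a separation efficiency of 95% when the distance between the conveyor belt and the suction inlet was 0.2 m, but the misclassification rate of recycled aggregate was 2% or more. The operating point that satisfied both the separation efficiency and a misclassification rate of 2% or less was a suction inlet distance of 0.254 m at a control frequency of 58 Hz. In the flow analysis, no misclassification of recycled aggregate occurred in the foreign-matter separator. A wind-based separation system using a rotating separation net was thus built that can be installed and operated in existing recycled aggregate production processes to reduce foreign matter.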

Bootstrap confidence intervals for classification error rate in circular models when a block of observations is missing

  • Chung, Hie-Choon;Han, Chien-Pai
    • Journal of the Korean Data and Information Science Society / Vol. 20, No. 4 / pp. 757-764 / 2009
  • In discriminant analysis, we consider a special pattern which contains a block of missing observations. We assume that the two populations are equally likely and the costs of misclassification are equal. In this situation, we consider the bootstrap confidence intervals of the error rate in the circular models when the covariance matrices are equal and not equal.

Bootstrap Confidence Intervals of Classification Error Rate for a Block of Missing Observations

  • Chung, Hie-Choon
    • Communications for Statistical Applications and Methods / Vol. 16, No. 4 / pp. 675-686 / 2009
  • In this paper, we assume two distinct multivariate normal populations with a common covariance matrix. We also assume that the two populations are equally likely and that the costs of misclassification are equal. The classification rule depends on whether the training samples include missing values. We consider bootstrap confidence intervals for the classification error rate when a block of observations is missing.
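A percentile bootstrap interval for an error rate can be sketched as follows. This is a simplified illustration, not the paper's procedure: it assumes the classifier is already fitted and resamples only its per-observation errors, whereas a fuller treatment would refit the classification rule (and handle the missing block) in each bootstrap replicate. The function name and its parameters are hypothetical.

```python
import numpy as np

def bootstrap_error_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a classification error rate.

    Resamples the 0/1 error indicators with replacement and returns
    (observed error rate, lower bound, upper bound).
    """
    rng = np.random.default_rng(seed)
    errors = (np.asarray(y_true) != np.asarray(y_pred)).astype(float)
    n = errors.size
    # Each row holds the indices of one bootstrap resample of size n.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_rates = errors[idx].mean(axis=1)
    lower, upper = np.quantile(boot_rates, [alpha / 2, 1 - alpha / 2])
    return float(errors.mean()), float(lower), float(upper)
```

Calling `bootstrap_error_ci(y_true, y_pred)` on held-out labels and predictions gives a 95% interval by default. Changing `alpha` adjusts the coverage level.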

유전자 알고리즘을 활용한 데이터 불균형 해소 기법의 조합적 활용 (Combined Use of Data Imbalance Mitigation Techniques Using a Genetic Algorithm)

  • 장영식;김종우;허준
    • 한국지능정보시스템학회:학술대회논문집 / 한국지능정보시스템학회 2007년도 한국지능정보시스템학회 / pp. 309-320 / 2007
  • The data imbalance problem, which can be encountered in data mining classification problems, typically means that one class has far more or far fewer instances than the other classes. It causes low prediction accuracy for the minority class, because classifiers tend to assign instances to the majority classes and ignore the minority class in order to reduce the overall misclassification rate. A number of techniques have been proposed to solve the data imbalance problem, based on resampling with replacement, adjusting decision thresholds, and adjusting the costs of the different classes. In this paper, we study the feasibility of combining these previously proposed techniques and suggest a combination method that uses a genetic algorithm to find the optimal combination ratio of the techniques. To improve the prediction accuracy of the minority class, we use the F-value of the minority class as the fitness function of the genetic algorithm when determining the combination ratio. To compare performance against the single techniques and against a matrix-style combination with random percentages, we performed experiments on four public datasets that are commonly used to compare methods for the data imbalance problem. The experimental results demonstrate the usefulness of the proposed method.
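To show why a minority-class F-value is a more informative fitness signal than the overall misclassification rate, here is a minimal sketch of the F-measure for one class. The abstract does not say which variant of the F-value was used, so the balanced F1 form (`beta=1.0`) and the name `minority_f_measure` are assumptions.

```python
def minority_f_measure(y_true, y_pred, minority_label=1, beta=1.0):
    """F-measure of the minority class computed from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority_label and p == minority_label)
    fp = sum(1 for t, p in pairs if t != minority_label and p == minority_label)
    fn = sum(1 for t, p in pairs if t == minority_label and p != minority_label)
    if tp == 0:
        # No minority instance was correctly found, so precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

A classifier that always predicts the majority class on a dataset with two minority instances out of ten has a misclassification rate of only 20%, yet its minority F-measure is 0, which is the behavior such a fitness function would penalize.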

A Recursive Partitioning Rule for Binary Decision Trees

  • Kim, Sang-Guin
    • Communications for Statistical Applications and Methods / Vol. 10, No. 2 / pp. 471-478 / 2003
  • In this paper, we reconsider the Kolmogorov-Smirnov distance as a split criterion for binary decision trees and suggest an algorithm that obtains the Kolmogorov-Smirnov distance more efficiently when the input variable has more than three categories. The Kolmogorov-Smirnov distance is shown to have the property of exclusive preference. Empirical results comparing the Kolmogorov-Smirnov distance with the Gini index show that the Kolmogorov-Smirnov distance grows more accurate trees in terms of misclassification rate.
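The two split criteria can be compared on a single candidate split with a short sketch. It assumes a numeric input, a threshold split, and binary labels 0/1 with both classes present. It does not reproduce the paper's efficient algorithm for categorical inputs, and the function names are illustrative.

```python
import numpy as np

def ks_distance(x, y, threshold):
    """Kolmogorov-Smirnov distance of the split x <= threshold.

    It is the absolute difference between the two classes' empirical
    distribution functions evaluated at the threshold.
    """
    x, y = np.asarray(x), np.asarray(y)
    left = x <= threshold
    return float(abs(left[y == 1].mean() - left[y == 0].mean()))

def gini(y):
    """Gini impurity 2p(1 - p) of a set of binary labels."""
    y = np.asarray(y)
    if y.size == 0:
        return 0.0
    p = float(np.mean(y == 1))
    return 2.0 * p * (1.0 - p)

def gini_gain(x, y, threshold):
    """Decrease in Gini impurity produced by the split x <= threshold."""
    x, y = np.asarray(x), np.asarray(y)
    left = x <= threshold
    w_left = left.sum() / y.size
    return gini(y) - w_left * gini(y[left]) - (1.0 - w_left) * gini(y[~left])
```

Scanning all candidate thresholds and keeping the one with the largest `ks_distance` (or the largest `gini_gain`) gives the split each criterion would choose at a node.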

PfSGA를 이용한 MLP 분류기의 구조 학습 (A Structural Learning of MLP Classifiers Using PfSGA)

  • 愼晟孝;金 商雲
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 1998년도 추계종합학술대회 논문집 / pp. 1277-1280 / 1998
  • We propose a structural learning method for MLP classifiers for a given application using PfSGA (parameter-free species genetic algorithm), which combines the species genetic algorithm (SGA) and the parameter-free genetic algorithm (PfGA). Experimental results show that PfSGA reduces the learning time of SGA and that parameter values have no influence on structural learning. We also confirm that PfSGA is more efficient than the other methods in terms of misclassification ratio, learning rate, and complexity of the MLP structure.

GAVQ를 이용한 음성인식에 관한 연구 (A Study on Speech Recognition using GAVQ(Genetic Algorithms Vector Quantization))

  • 이상희;이재곤;정호균;김용연;남재성
    • 산업기술연구 / Vol. 19 / pp. 209-216 / 1999
  • In this paper, we propose a modified genetic algorithm that minimizes the misclassification rate when determining the codebook. Genetic algorithms are adaptive methods, based on the genetic processes of biological organisms, that can be used to solve search and optimization problems, but they generally require a large amount of computation. GAVQ chooses the optimal individuals by means of genetic operators, and the positions of the individuals are optimized to improve the recognition rate. The technical advantage of this approach is that it avoids the local minimum problem, which conventional VQ algorithms cannot avoid. The methods were compared by simulation in Matlab using phoneme data. The simulation results show that GAVQ improves the recognition rate compared with conventional VQ algorithms.
