• Title/Abstract/Keywords: Statistical classification

1,415 search results (processing time: 0.03 seconds)

WHEN CAN SUPPORT VECTOR MACHINE ACHIEVE FAST RATES OF CONVERGENCE?

  • Park, Chang-Yi
    • Journal of the Korean Statistical Society
    • /
    • Vol. 36, No. 3
    • /
    • pp.367-372
    • /
    • 2007
  • Classification, as a tool for extracting information from data, plays an important role in science and engineering. Among various classification methodologies, the support vector machine has recently seen significant development. The central problem this paper addresses is the accuracy of the support vector machine. In particular, we are interested in the situations where fast rates of convergence to the Bayes risk can be achieved by the support vector machine. Through learning examples, we illustrate that the support vector machine may yield fast rates if the space spanned by the adopted kernel is sufficiently large.
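
A minimal sketch of the phenomenon described above, assuming a synthetic two-class problem with a nonlinear Bayes boundary and scikit-learn: the RBF kernel spans a much larger function space than the linear kernel and can therefore approach the Bayes risk here. This is an illustration, not the paper's learning examples.

    # Hedged sketch: a large kernel-induced function space can approach the
    # Bayes risk on a problem a linear kernel cannot represent.
    # Synthetic data; not the paper's examples.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(2000, 2))
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 0.5).astype(int)  # circular boundary

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    for kernel in ("linear", "rbf"):
        acc = SVC(kernel=kernel, C=1.0).fit(X_tr, y_tr).score(X_te, y_te)
        print(kernel, f"test accuracy = {acc:.3f}")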

Tree-structured Classification based on Variable Splitting

  • Ahn, Sung-Jin
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 2, No. 1
    • /
    • pp.74-88
    • /
    • 1995
  • This article introduces a unified method of choosing the most explanatory and significant multiway partitions for classification tree design and analysis. The method is based on the impurity reduction (IR) measure of divergence, which is proposed to extend the proportional-reduction-in-error (PRE) measure in the decision-theory context. To derive the method, the IR measure is analyzed to characterize the statistical properties used to handle consistently the tasks of feature formation, feature selection, and feature deletion required in the associated classification tree construction. A numerical example is considered to illustrate the proposed approach.
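
The paper defines its own IR measure; as a hedged stand-in, the sketch below scores a candidate three-way partition by a generic impurity reduction with Gini impurity. The labels and the choice of Gini are assumptions; only the mechanics of evaluating a multiway split are illustrated.

    # Generic impurity reduction for a candidate multiway split;
    # Gini impurity stands in for the paper's IR measure (assumption).
    import numpy as np

    def gini(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return 1.0 - np.sum(p ** 2)

    def impurity_reduction(y_parent, partition):
        """partition: list of label arrays, one per child node."""
        n = len(y_parent)
        weighted = sum(len(c) / n * gini(c) for c in partition)
        return gini(y_parent) - weighted

    y = np.array([0, 0, 0, 1, 1, 1, 2, 2])
    split = [y[:3], y[3:6], y[6:]]       # a three-way candidate partition
    print(impurity_reduction(y, split))  # pure children -> maximal reduction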

Statistical bioinformatics for gene expression data

  • Lee, Jae-K.
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, 2nd International Symposium on Bioinformatics, 2001
    • /
    • pp.103-127
    • /
    • 2001
  • Gene expression studies require statistical experimental design and validation before laboratory confirmation. Various clustering approaches, such as hierarchical clustering, K-means, and SOM, are commonly used for unsupervised learning on gene expression data. Several classification methods, such as gene voting, SVM, or discriminant analysis, are used for supervised learning where a well-defined response classification is possible. Estimating gene-condition interaction effects requires advanced, computationally intensive statistical approaches.
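
A toy sketch of the two analysis modes named above, assuming a synthetic samples-by-genes matrix and scikit-learn: samples are first clustered without labels (unsupervised), then classified with a linear SVM where the response is defined (supervised).

    # Synthetic expression matrix (samples x genes); not real microarray data.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    rng = np.random.default_rng(1)
    expr = np.vstack([rng.normal(0, 1, (20, 50)),   # condition A samples
                      rng.normal(1, 1, (20, 50))])  # condition B samples
    labels = np.array([0] * 20 + [1] * 20)

    # Unsupervised learning: cluster samples without using the labels.
    clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(expr)

    # Supervised learning: classify when a well-defined response exists.
    acc = cross_val_score(SVC(kernel="linear"), expr, labels, cv=5).mean()
    print(clusters, f"CV accuracy = {acc:.2f}")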

Statistical Information-Based Hierarchical Fuzzy-Rough Classification Approach

  • 손창식;서석태;정환묵;권순학
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • Vol. 17, No. 6
    • /
    • pp.792-798
    • /
    • 2007
  • In this paper, we propose a statistical information-based hierarchical fuzzy-rough classification method that maximizes pattern classification performance and reduces the number of rules without using a learning technique. In the proposed method, statistical information is used to extract the partition intervals of the input fuzzy sets at each layer of the hierarchical fuzzy-rough classification system, and rough sets are used to minimize the number of fuzzy if-then rules associated with the partition intervals extracted from the statistical information. To demonstrate the effectiveness of the proposed method, its classification accuracy and number of rules were compared with those of existing pattern classification methods on Fisher's IRIS data. The results confirm that the proposed method achieves classification performance similar to that of the existing methods.
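
A hedged sketch of the statistical-information step on Fisher's IRIS data: partition intervals for one input feature are read off class-conditional means and standard deviations. The mean-plus-or-minus-one-standard-deviation rule is an assumption, and the rough-set rule-reduction stage is not reproduced.

    # Derive per-class partition intervals for one feature from statistics.
    import numpy as np
    from sklearn.datasets import load_iris

    X, y = load_iris(return_X_y=True)
    feature = X[:, 2]                    # petal length, as an example input
    for c in np.unique(y):
        vals = feature[y == c]
        lo, hi = vals.mean() - vals.std(), vals.mean() + vals.std()
        print(f"class {c}: fuzzy set support approx. [{lo:.2f}, {hi:.2f}]")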

Comparison with Finger Print Method and NN as PD Classification

  • 박성희;박재열;이강원;강성화;임기조
    • Korean Institute of Electrical and Electronic Material Engineers: Conference Proceedings
    • /
    • Proceedings of the KIEEME 2003 Summer Conference, Vol. 4, No. 2
    • /
    • pp.1163-1167
    • /
    • 2003
  • Statistical distribution parameters have been used as a PD classification method for several decades, and these parameters have recently been combined with the fingerprint method, neural networks (NN), and other techniques. In this paper we study the fingerprint method and an NN trained with the back-propagation (BP) learning algorithm, both using statistical distribution parameters, and compare the two as classification methods. The comparison shows that the NN classifies better than the fingerprint method with respect to calculation speed, visualization, and simplicity, so the NN is the more advantageous tool for PD classification.
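
A minimal sketch of the pipeline described above: statistical distribution parameters (mean, standard deviation, skewness, kurtosis) are computed from pulse-height distributions and fed to a back-propagation-trained network. The synthetic data, feature set, and network size are assumptions.

    # Statistical distribution parameters as features for a BP-trained NN.
    import numpy as np
    from scipy.stats import kurtosis, skew
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(2)

    def features(pulse_heights):
        return [pulse_heights.mean(), pulse_heights.std(),
                skew(pulse_heights), kurtosis(pulse_heights)]

    # Two defect classes with differently shaped pulse-height distributions.
    X = np.array([features(rng.gamma(2.0, 1.0, 200)) for _ in range(50)] +
                 [features(rng.normal(5.0, 1.0, 200)) for _ in range(50)])
    y = np.array([0] * 50 + [1] * 50)

    nn = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000,
                       random_state=0).fit(X, y)  # back-propagation training
    print("training accuracy:", nn.score(X, y))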

New Splitting Criteria for Classification Trees

  • Lee, Yung-Seop
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 8, No. 3
    • /
    • pp.885-894
    • /
    • 2001
  • Decision tree methods are among the most common data mining techniques. Classification trees are used to predict a class label. When a tree grows, the conventional splitting criteria measure node impurity by the weighted average of the impurities of the left and right child nodes. In this paper, new splitting criteria for classification trees are proposed which improve the interpretability of trees compared to the conventional methods. The criteria search only for interesting subsets of the data, as opposed to modeling all of the data equally well. As a result, the tree is very unbalanced but extremely interpretable.
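
The paper's exact criteria are not reproduced here; the sketch below contrasts the conventional weighted-average impurity with a hypothetical "subset-seeking" score that rewards one very pure child, the behaviour that yields unbalanced but interpretable trees.

    # Conventional vs. subset-seeking split scores (lower is better).
    import numpy as np

    def gini(y):
        _, c = np.unique(y, return_counts=True)
        p = c / c.sum()
        return 1.0 - np.sum(p ** 2)

    def conventional(left, right):
        n = len(left) + len(right)
        return len(left) / n * gini(left) + len(right) / n * gini(right)

    def subset_seeking(left, right):
        return min(gini(left), gini(right))  # reward one very pure subset

    left = np.array([0, 0, 0, 0])            # small, perfectly pure node
    right = np.array([0, 1, 1, 0, 1, 0])     # large, mixed node
    print(conventional(left, right), subset_seeking(left, right))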

Evaluating Predictive Ability of Classification Models with Ordered Multiple Categories

  • Oong-Hyun Sung
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 6, No. 2
    • /
    • pp.383-395
    • /
    • 1999
  • This study is concerned with evaluating the predictive ability of classification models with ordered multiple categories. If the categories can be ordered or ranked, the spread of misclassification should be considered when evaluating the performance of classification models, using a loss rate, since the apparent error rate cannot measure that spread. Because the loss rate is known to underestimate the true loss rate, the bootstrap method was used to estimate the true loss rate. This study therefore suggests a method to evaluate the predictive power of classification models using the loss rate together with the bootstrap estimate of the true loss rate.
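
A hedged sketch of the quantities discussed above for three ordered classes: a distance-weighted loss rate, its optimistic apparent (resubstitution) value, and a simple bootstrap estimate. The absolute-rank-distance loss, the logistic classifier, and the bootstrap variant are assumptions, not necessarily the paper's choices.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def loss_rate(y_true, y_pred):
        # spread of misclassification: two ranks off costs twice one rank
        return np.abs(y_true - y_pred).mean()

    rng = np.random.default_rng(3)
    X = rng.normal(size=(150, 4))
    y = np.clip((X[:, 0] + rng.normal(0, 1, 150)).round(), -1, 1).astype(int) + 1

    model = LogisticRegression(max_iter=1000).fit(X, y)
    apparent = loss_rate(y, model.predict(X))    # known to be optimistic

    boots = []
    for _ in range(200):                         # refit on resamples, test on all
        idx = rng.integers(0, len(y), len(y))
        m = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
        boots.append(loss_rate(y, m.predict(X)))
    print(f"apparent = {apparent:.3f}, bootstrap = {np.mean(boots):.3f}")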

Classification Using Sliced Inverse Regression and Sliced Average Variance Estimation

  • Lee, Hakbae
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 11, No. 2
    • /
    • pp.275-285
    • /
    • 2004
  • We explore classification analysis using graphical methods, such as sliced inverse regression and sliced average variance estimation, based on dimension reduction. Useful information about classification analysis is obtained through the dimension reduction performed by sliced inverse regression and sliced average variance estimation. Two examples are illustrated, and the classification rates of sliced inverse regression and sliced average variance estimation are compared with those of discriminant analysis and logistic regression.
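
For a categorical response the slices in sliced inverse regression coincide with the classes, so a minimal numpy sketch on the iris data is possible; SAVE, which also uses within-slice covariances, is omitted, and keeping two directions is an assumption.

    # Minimal SIR: eigenvectors of the between-slice covariance of
    # standardized predictors give the reduced directions.
    import numpy as np
    from sklearn.datasets import load_iris

    X, y = load_iris(return_X_y=True)
    Xc = X - X.mean(axis=0)
    Sigma = np.cov(Xc, rowvar=False)
    evals, evecs = np.linalg.eigh(Sigma)
    root_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T  # Sigma^(-1/2)
    Z = Xc @ root_inv                                    # standardized X

    M = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(y):                # slice = class for classification
        mu = Z[y == c].mean(axis=0)
        M += (y == c).mean() * np.outer(mu, mu)

    w, v = np.linalg.eigh(M)
    directions = root_inv @ v[:, ::-1][:, :2]  # top 2, back on original scale
    print(directions)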

Bootstrap Confidence Intervals of Classification Error Rate for a Block of Missing Observations

  • Chung, Hie-Choon
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 16, No. 4
    • /
    • pp.675-686
    • /
    • 2009
  • In this paper, it is assumed that there are two distinct populations which are multivariate normal with equal covariance matrices. We also assume that the two populations are equally likely and that the costs of misclassification are equal. The classification rule depends on whether or not the training samples include missing values. We consider bootstrap confidence intervals for the classification error rate when a block of observations is missing.
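
A sketch of the complete-data case described above: a percentile bootstrap confidence interval for the error rate of the linear discriminant rule under two equal-covariance normal populations with equal priors. The paper's treatment of a missing block of observations is not reproduced; the data and the 95% level are assumptions.

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(4)
    n, p = 60, 3
    X = np.vstack([rng.normal(0, 1, (n, p)), rng.normal(1, 1, (n, p))])
    y = np.array([0] * n + [1] * n)

    rates = []
    for _ in range(500):
        idx = rng.integers(0, len(y), len(y))
        if len(np.unique(y[idx])) < 2:        # need both classes to fit
            continue
        lda = LinearDiscriminantAnalysis().fit(X[idx], y[idx])
        rates.append(1.0 - lda.score(X, y))   # error on the full sample
    lo, hi = np.percentile(rates, [2.5, 97.5])
    print(f"95% bootstrap CI for error rate: ({lo:.3f}, {hi:.3f})")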

On Line LS-SVM for Classification

  • Kim, Daehak;Oh, KwangSik;Shim, Jooyong
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 10, No. 2
    • /
    • pp.595-601
    • /
    • 2003
  • In this paper we propose an on-line training method for classification based on the least squares support vector machine. The proposed method reduces the computational cost and allows the training to be performed incrementally. With an incremental formulation of the inverse matrix in the optimization problem, the current information and new input data can be used to build the new inverse matrix for estimating the optimal bias and Lagrange multipliers, so that the large-scale matrix inversion operation can be avoided. Numerical examples are included which indicate the performance of the proposed algorithm.
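
The computational core described above is growing the inverse of the bordered LS-SVM system matrix by one row and column when a new observation arrives. The sketch below shows that update via the Schur-complement block-inversion identity on a generic symmetric positive definite matrix; the LS-SVM kernel bookkeeping itself is omitted, so the matrix here only stands in for K + I/gamma with its border.

    # Grow the inverse of [[A, col], [col.T, corner]] from A's known inverse,
    # avoiding a from-scratch large-scale inversion (Schur complement update).
    import numpy as np

    def grow_inverse(A_inv, col, corner):
        u = A_inv @ col
        s = corner - col @ u              # scalar Schur complement
        top_left = A_inv + np.outer(u, u) / s
        return np.block([[top_left, -u[:, None] / s],
                         [-u[None, :] / s, np.array([[1.0 / s]])]])

    rng = np.random.default_rng(5)
    A = rng.normal(size=(4, 4))
    A = A @ A.T + 4 * np.eye(4)           # SPD, like a kernel matrix + I/gamma
    col, corner = rng.normal(size=4), 7.0

    grown = grow_inverse(np.linalg.inv(A), col, corner)
    full = np.linalg.inv(np.block([[A, col[:, None]],
                                   [col[None, :], np.array([[corner]])]]))
    print(np.allclose(grown, full))       # True: update matches direct inverse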