• Title/Summary/Keyword: feature vector classification

Improving the Performance of a Fast Text Classifier with Document-side Feature Selection (문서측 자질선정을 이용한 고속 문서분류기의 성능향상에 관한 연구)

  • Lee, Jae-Yun
    • Journal of Information Management / v.36 no.4 / pp.51-69 / 2005
  • High-speed classification has become an important research issue in text categorization systems. A fast text categorization technique named feature value voting was recently introduced for text categorization problems, but its classification accuracy is not as good as its classification speed. We present a novel approach to feature selection, named document-side feature selection, and apply it to the feature value voting method. In this approach, there is no feature selection process in the learning phase; instead, real-time feature selection is executed in the classification phase. Our results show that feature value voting with document-side feature selection enables a fast and accurate text classification system, which seems to be competitive in classification performance with Support Vector Machines, the state-of-the-art text categorization algorithm.
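A minimal Python sketch of the idea in the abstract above, under stated assumptions. The abstract does not give the voting weights or the selection criterion, so here each term votes with its share of training documents per category, and the classifier keeps only the `top_k` terms of the incoming document whose strongest vote is largest. The names `train_votes`, `classify`, and `top_k` are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def train_votes(docs):
    """Learn per-term votes from (tokens, label) pairs.

    No feature selection happens here: every term seen in training is kept.
    Returns term -> {label: fraction of the term's documents with that label}.
    """
    counts = defaultdict(Counter)
    for tokens, label in docs:
        for term in set(tokens):
            counts[term][label] += 1
    return {
        term: {label: n / sum(per_label.values()) for label, n in per_label.items()}
        for term, per_label in counts.items()
    }


def classify(tokens, votes, top_k=3):
    """Select features on the document side, then sum the selected votes."""
    known = [t for t in set(tokens) if t in votes]
    # Assumed criterion: keep the terms whose strongest vote is largest.
    selected = sorted(known, key=lambda t: (-max(votes[t].values()), t))[:top_k]
    tally = Counter()
    for term in selected:
        for label, weight in votes[term].items():
            tally[label] += weight
    if not tally:
        return None  # no known term in the document
    return max(sorted(tally), key=tally.get)


training = [
    (["ball", "goal", "team"], "sports"),
    (["ball", "score", "match"], "sports"),
    (["vote", "party", "election"], "politics"),
    (["party", "vote", "law"], "politics"),
]
votes = train_votes(training)
print(classify(["goal", "team", "the", "party"], votes))
```

Because selection runs per document at classification time, the learning phase stays a single counting pass.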

Band Selection Using Forward Feature Selection Algorithm for Citrus Huanglongbing Disease Detection

  • Katti, Anurag R.;Lee, W.S.;Ehsani, R.;Yang, C.
    • Journal of Biosystems Engineering / v.40 no.4 / pp.417-427 / 2015
  • Purpose: This study investigated different band selection methods to classify spectrally similar data (obtained from aerial images of healthy citrus canopies and canopies infected with citrus greening disease, Huanglongbing or HLB) using small differences, without unmixing endmember components and therefore without the need for an endmember library. However, the large number of hyperspectral bands has high redundancy, which had to be reduced through band selection. The objective, therefore, was first to select the best set of bands and then to detect HLB-infected canopies in aerial hyperspectral images using these bands. Methods: The forward feature selection algorithm (FFSA) was chosen for band selection. The selected bands were used to identify HLB-infected pixels with various classifiers such as K nearest neighbor (KNN), support vector machine (SVM), naïve Bayesian classifier (NBC), and generalized local discriminant bases (LDB). Classification with all bands was also performed for comparison. Results: A few well-chosen bands yielded much better results than using all bands, and brought the classification results on par with standard hyperspectral classification techniques such as spectral angle mapper (SAM) and mixture tuned matched filtering (MTMF). Median detection accuracies ranged from 66% to 80%, which showed great potential for rapid detection of the disease. Conclusions: Among the methods investigated, a support vector machine classifier combined with the forward feature selection algorithm yielded the best results.
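Forward feature selection, as named in the abstract above, can be sketched as a greedy loop that adds one band at a time. This is an illustration, not the study's implementation: the scoring function (leave-one-out accuracy of a 1-nearest-neighbour rule), the stopping rule, and the toy data are all assumptions, and `forward_select` and `loo_1nn_accuracy` are hypothetical names.

```python
def loo_1nn_accuracy(X, y, bands):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule restricted to `bands`."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[b] - X[j][b]) ** 2 for b in bands),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


def forward_select(X, y, max_bands):
    """Greedily add the band that gives the best score; stop when nothing improves."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_bands:
        scored = [(loo_1nn_accuracy(X, y, selected + [b]), b) for b in remaining]
        # Highest score wins; ties go to the lowest band index.
        score, band = max(scored, key=lambda s: (s[0], -s[1]))
        if selected and score <= best_score:
            break
        selected.append(band)
        remaining.remove(band)
        best_score = score
    return selected, best_score


# Toy pixels with three "bands"; only band 1 separates the two classes.
X = [[5, 0.10, 3], [1, 0.20, 9], [4, 0.15, 1],
     [2, 0.90, 2], [5, 1.00, 8], [1, 0.95, 4]]
y = [0, 0, 0, 1, 1, 1]
print(forward_select(X, y, max_bands=2))
```

On this toy data the loop picks band 1 and then stops, since adding either noisy band cannot raise the score further.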

Pedestrian recognition using differential Haar-like feature based on Adaboost algorithm to apply intelligence wheelchair (지능형 휠체어 적용을 위해 Haar-like의 기울기 특징을 이용한 아다부스트 알고리즘 기반의 보행자 인식)

  • Lee, Sang-Hun;Park, Sang-Hee;Lee, Yeung-Hak;Seo, Hee-Don
    • Journal of Biomedical Engineering Research / v.31 no.6 / pp.481-486 / 2010
  • In this paper, we propose an improved algorithm for pedestrian/non-pedestrian recognition using differential Haar-like features, which applies the AdaBoost algorithm to build a strong classifier from weak classifiers. First, we extract two feature vectors: a horizontal Haar-like feature and a vertical Haar-like feature. Next, we calculate the proposed feature vector using the differential Haar-like method. Then, a strong classifier is obtained from weak classifiers for a composite recognition method using the differential area of the horizontal and vertical Haar-like features. In the proposed method, we use one feature vector and one strong classifier for the first stage of recognition. In our experiments, the proposed algorithm shows a higher recognition rate than the traditional method for pedestrians and non-pedestrians.
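The way AdaBoost builds a strong classifier from weak ones can be illustrated with a short, generic sketch. It uses threshold stumps on a toy one-dimensional feature rather than the Haar-like features of the paper; the function names and the data are assumptions made for illustration only.

```python
import math


def stump_predict(x, feature, threshold, polarity):
    """Weak classifier: `polarity` if x[feature] >= threshold, else -polarity."""
    return polarity if x[feature] >= threshold else -polarity


def best_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, f, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def adaboost(X, y, rounds):
    """Train `rounds` weighted stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, f, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, pol))
        # Raise the weight of misclassified samples, lower the rest, renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, f, thr, pol))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, x):
    """Strong classifier: sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, f, thr, pol) for a, f, thr, pol in model)
    return 1 if score >= 0 else -1


# No single threshold separates these labels, but three stumps together do.
X = [[0], [1], [2], [3], [4], [5]]
y = [1, 1, -1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
print([predict(model, x) for x in X])
```

A single stump can only predict one label on each side of a threshold, so it cannot fit the middle interval; the reweighting step is what forces later stumps to focus on the samples earlier ones got wrong.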

Classification of TV Program Scenes Based on Audio Information

  • Lee, Kang-Kyu;Yoon, Won-Jung;Park, Kyu-Sik
    • The Journal of the Acoustical Society of Korea / v.23 no.3E / pp.91-97 / 2004
  • In this paper, we propose a classification system for TV program scenes based on audio information. The system classifies a video scene into one of six categories: commercials, basketball games, football games, news reports, weather forecasts, and music videos. Two types of audio feature sets, timbral features and coefficient domain features, are extracted from each audio frame, resulting in a 58-dimensional feature vector. To reduce the computational complexity of the system, the 58-dimensional feature set is further optimized to 10 dimensions through the Sequential Forward Selection (SFS) method. This down-sized feature set is finally used to train and classify the given TV program scenes using the k-NN and Gaussian pattern matching algorithms. The classification accuracy of 91.6% reported here shows the promising performance of video scene classification based on audio information. Finally, the system stability problem with respect to different query lengths is investigated.
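For the k-NN step mentioned in the abstract above, a minimal sketch looks like this. The two-dimensional vectors stand in for the reduced audio feature vectors and are invented for illustration; `knn_predict` is a hypothetical name, and squared Euclidean distance is an assumed choice of metric.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to `query`.

    train: list of (vector, label) pairs.
    """
    nearest = sorted(
        train,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


train = [
    ([0.0, 0.0], "news"), ([0.0, 1.0], "news"), ([1.0, 0.0], "news"),
    ([5.0, 5.0], "music"), ([5.0, 6.0], "music"), ([6.0, 5.0], "music"),
]
print(knn_predict(train, [0.5, 0.5]))
print(knn_predict(train, [5.2, 5.1]))
```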

Texture Classification Using Wavelet-Domain BDIP and BVLC Features With WPCA Classifier (웨이브렛 영역의 BDIP 및 BVLC 특징과 WPCA 분류기를 이용한 질감 분류)

  • Kim, Nam-Chul;Kim, Mi-Hye;So, Hyun-Joo;Jang, Ick-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.49 no.2 / pp.102-112 / 2012
  • In this paper, we propose a texture classification method using wavelet-domain BDIP (block difference of inverse probabilities) and BVLC (block variance of local correlation coefficients) features with a WPCA (whitened principal component analysis) classifier. In the proposed method, the wavelet transform is first applied to a query image. The BDIP and BVLC operators are next applied to the wavelet subbands. Global moments for each subband of BDIP and BVLC are then computed and fused into a feature vector. In classification, the WPCA classifier, which is usually adopted in face identification, searches for the training feature vector most similar to the query feature vector. Experimental results show that the proposed method yields excellent texture classification with low feature dimension for test texture image DBs.

Improving the Performance of SVM Text Categorization with Inter-document Similarities (문헌간 유사도를 이용한 SVM 분류기의 문헌분류성능 향상에 관한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Information Management / v.22 no.3 s.57 / pp.261-287 / 2005
  • The purpose of this paper is to explore ways to improve the performance of SVM (Support Vector Machine) text classifiers using inter-document similarities. SVMs are powerful machine learning systems that are considered the state-of-the-art technique for automatic document classification. In this paper, an SVM-based text categorization approach that uses document vectors for feature representation is suggested. In this approach, document vectors instead of index terms are used as features, and vector similarities instead of term weights are used as feature values. Experiments show that an SVM classifier with document vector features can improve document classification performance. For the sake of run-time efficiency, two methods are developed: one is to select document vector features, and the other is to use category centroid vector features instead. Experiments on these two methods show that a small vector feature set yields better performance than conventional methods with index term features.
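The category-centroid variant described in the abstract above can be sketched as follows: each document is re-represented by its similarity to one centroid per category, and those similarities become the feature values passed to the classifier. Cosine similarity, the toy term vectors, and all function names are assumptions made for illustration; the SVM training step itself is omitted.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def category_centroids(doc_vectors, labels):
    """Mean term vector of the training documents in each category."""
    groups = defaultdict(list)
    for vec, label in zip(doc_vectors, labels):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def similarity_features(doc_vector, centroids):
    """New representation: one similarity value per category centroid."""
    return [cosine(doc_vector, centroids[label]) for label in sorted(centroids)]


# Toy term-weight vectors over a three-term vocabulary.
docs = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [0, 1, 1]]
labels = ["a", "a", "b", "b"]
centroids = category_centroids(docs, labels)
print(similarity_features([2, 0, 0], centroids))
```

The resulting vector has one dimension per category regardless of vocabulary size, which is consistent with the run-time motivation given in the abstract.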

Medical Image Retrieval based on Multi-class SVM and Correlated Categories Vector

  • Park, Ki-Hee;Ko, Byoung-Chul;Nam, Jae-Yeal
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.8C / pp.772-781 / 2009
  • This paper proposes a novel algorithm for the efficient classification and retrieval of medical images. After color and edge features are extracted from medical images, the two feature vectors are applied to a multi-class Support Vector Machine to produce membership vectors. The two membership vectors are then combined into an ensemble feature vector. In addition, to reduce the search time, a Correlated Categories Vector is proposed for similarity matching. The experimental results show that the proposed system improves retrieval performance compared to other methods.

Effective Korean sentiment classification method using word2vec and ensemble classifier (Word2vec과 앙상블 분류기를 사용한 효율적 한국어 감성 분류 방안)

  • Park, Sung Soo;Lee, Kun Chang
    • Journal of Digital Contents Society / v.19 no.1 / pp.133-140 / 2018
  • Accurate sentiment classification is an important research topic in sentiment analysis. This study suggests an efficient method for Korean sentiment classification using word2vec and ensemble methods, both of which have recently been studied from various angles. Using 200,000 Korean movie review texts, we generate a POS-based BOW feature, a word2vec feature, and an integrated feature that combines the two representations. For sentiment classification, we used single classifiers (Logistic Regression, Decision Tree, Naive Bayes, and Support Vector Machine) and ensemble classifiers (Adaptive Boosting, Bagging, Gradient Boosting, and Random Forest). The integrated feature representation, composed of a BOW feature including adjectives and adverbs and a word2vec feature, showed the highest sentiment classification accuracy. Empirical results show that SVM, a single classifier, has the highest performance, while the ensemble classifiers show similar or slightly lower performance than the single classifier.

Feature Selection for Multi-Class Support Vector Machines Using an Impurity Measure of Classification Trees: An Application to the Credit Rating of S&P 500 Companies

  • Hong, Tae-Ho;Park, Ji-Young
    • Asia Pacific Journal of Information Systems / v.21 no.2 / pp.43-58 / 2011
  • Support vector machines (SVMs), a machine learning technique, have been applied not only to binary classification problems such as bankruptcy prediction but also to multi-class problems such as corporate credit ratings. In general, however, depending on the selection of predictors, the performance of SVMs can easily fall below that of the best alternative model, even though SVMs have the distinguishing feature of successfully classifying and predicting in many dichotomous and multi-class problems. To overcome this weakness, this study proposes an approach for selecting features for multi-class SVMs that utilizes the impurity measures of classification trees. For the selection of the input features, we employed the C4.5 and CART algorithms, along with the stepwise method of discriminant analysis, which is a well-known method for selecting features. We built a multi-class SVM model for credit rating using the above method and present experimental results with data on S&P 500 companies.
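One way to picture impurity-based feature selection is to rank each candidate feature by the largest impurity decrease a single tree split on it can achieve. The sketch below uses the Gini index, the impurity measure associated with CART; it is not the procedure of the paper, which also uses C4.5 and stepwise discriminant analysis. The function names, the two toy "financial ratio" columns, and the rating labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0


def best_gini_gain(values, labels):
    """Largest impurity decrease from one threshold split on a numeric feature."""
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    for thr in sorted(set(values))[1:]:
        left = [l for v, l in zip(values, labels) if v < thr]
        right = [l for v, l in zip(values, labels) if v >= thr]
        gain = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
        best = max(best, gain)
    return best


def rank_features(X, y):
    """Feature indices ordered from most to least useful by best Gini gain."""
    gains = [(best_gini_gain([row[f] for row in X], y), f) for f in range(len(X[0]))]
    return [f for _, f in sorted(gains, key=lambda t: (-t[0], t[1]))]


# Column 0 orders the three rating classes cleanly; column 1 is mostly noise.
X = [[1, 7], [2, 3], [3, 8], [4, 2], [5, 9], [6, 1]]
y = ["A", "A", "B", "B", "C", "C"]
print(rank_features(X, y))
```

The top-ranked features would then be the inputs to the multi-class SVM; how many to keep is a separate choice that the sketch leaves open.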

Sasang Constitution Classification System by Morphological Feature Extraction of Facial Images

  • Lee, Hye-Lim;Cho, Jin-Soo
    • Journal of the Korea Society of Computer and Information / v.20 no.8 / pp.15-21 / 2015
  • This study proposes a Sasang constitution classification system that increases the objectivity and reliability of Sasang constitution diagnosis using frontal face images, in order to address the subjectivity of classification based on the experience of Sasang constitution specialists. For classification, characteristics describing the shapes of the eyes, nose, mouth, and chin were defined and extracted through morphological statistical analysis of face images. Sasang constitution was then classified by an SVM (Support Vector Machine) classifier that takes the extracted characteristics as input. In the experiments, the proposed system achieved a correct recognition rate of 93.33%. Unlike existing systems, in which characteristic points are designated directly, this system showed a high correct recognition rate and is therefore expected to be useful as a more objective Sasang constitution classification system.