A Feature Selection-based Ensemble Method for Arrhythmia Classification

  • Namsrai, Erdenetuya;Munkhdalai, Tsendsuren;Li, Meijing;Shin, Jung-Hoon;Namsrai, Oyun-Erdene;Ryu, Keun Ho
    • Journal of Information Processing Systems / v.9 no.1 / pp.31-40 / 2013
  • In this paper, we propose a novel method for building an ensemble of classifiers using a feature selection schema. The schema identifies the feature sets that most affect arrhythmia classification. First, a number of feature subsets are extracted by applying the feature selection schema to the original dataset. Then a classification model is built on each feature subset. Finally, the models are combined by a voting approach to form a classification ensemble. The voting approach uses both the classification error rate and the feature selection rate to calculate the score of each classifier in the ensemble; the feature selection rate depends on the order in which the feature subsets were extracted. In the experiment, we applied our method to an arrhythmia dataset and generated the top three disjoint feature sets. We then built three classifiers on these feature subsets and formed the classifier ensemble using the voting approach. Our method can improve classification accuracy on high-dimensional datasets: each classifier, and their ensemble, outperformed the classifier built on the whole feature space of the dataset. The proposed approach improved classification performance and produced a more stable classification model.
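
The score-weighted voting step described in this abstract can be sketched as follows. The abstract does not give the scoring formula, so `classifier_score` (accuracy multiplied by the selection rate) is a hypothetical stand-in, and the error rates and selection rates below are made-up illustration values, not results from the paper.

```python
from collections import defaultdict


def classifier_score(error_rate, selection_rate):
    """Hypothetical score: accuracy weighted by the feature selection rate.

    The paper's actual formula is not stated in the abstract.
    """
    return (1.0 - error_rate) * selection_rate


def weighted_vote(predictions, scores):
    """Return the label whose supporting classifiers have the largest total score.

    predictions: one predicted label per classifier
    scores: one non-negative weight per classifier
    """
    if len(predictions) != len(scores):
        raise ValueError("need exactly one score per classifier")
    totals = defaultdict(float)
    for label, score in zip(predictions, scores):
        totals[label] += score
    return max(totals, key=totals.get)


# Three classifiers built on the 1st, 2nd and 3rd extracted feature subsets.
# Selection rates fall with extraction order (assumed values).
scores = [
    classifier_score(error_rate=0.30, selection_rate=1.0),
    classifier_score(error_rate=0.25, selection_rate=0.8),
    classifier_score(error_rate=0.28, selection_rate=0.6),
]
print(weighted_vote(["normal", "arrhythmia", "arrhythmia"], scores))
```

Here the two later classifiers together outweigh the first one, so the ensemble follows them even though the first classifier has the highest individual score.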

Classification of e-mail Using Dynamic Category Hierarchy and Automatic category generation (자동 카테고리 생성과 동적 분류 체계를 사용한 이메일 분류)

  • Ahn Chan Min;Park Sang Ho;Lee Ju-Hong;Choi Bum-Ghi;Park Sun
    • Journal of Intelligence and Information Systems / v.10 no.2 / pp.79-89 / 2004
  • As the volume of e-mail messages has increased, a new technique for efficient e-mail classification is needed. E-mail classification falls into two classes: binary classification and multi-class classification. Current binary classification methods are mostly spam filtering methods based on rules, Bayesian models, SVMs, and so on. Current multi-class classification methods are based on clustering, which groups e-mails by similarity. In this paper, we propose a novel method for e-mail classification that combines automatic category generation based on the vector model with dynamic category hierarchy construction. The method can classify e-mail into multiple categories automatically and manage a large amount of e-mail efficiently. In addition, it increases search accuracy through dynamic reclassification of e-mails.

Automatic e-mail Hierarchy Classification using Dynamic Category Hierarchy and Principal Component Analysis (PCA와 동적 분류체계를 사용한 자동 이메일 계층 분류)

  • Park, Sun
    • Journal of Advanced Navigation Technology / v.13 no.3 / pp.419-425 / 2009
  • The volume of incoming e-mail is increasing rapidly with the widespread use of the Internet, so incoming e-mails need to be classified efficiently and accurately. Current e-mail classification techniques focus on two-way classification, filtering spam from normal mail, mainly with Bayesian and rule-based methods. Clustering has been used for multi-way classification of e-mails, but it suffers from low classification accuracy and produces no category labels. Classification methods, in turn, require the user to provide training data and set the category labels. In this paper, we propose a novel multi-way e-mail hierarchy classification method that uses PCA for automatic category generation and a dynamic category hierarchy for high classification accuracy. It classifies a large volume of incoming e-mail automatically, efficiently, and accurately.

Accuracy Evaluation of Supervised Classification about IKONOS Imagery using Mixed Pixels (혼합화소를 이용한 IKONOS 영상의 감독분류정확도 평가)

  • Lee, Jong-Sin;Kim, Min-Gyu;Park, Joon-Kyu
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.6 / pp.2751-2756 / 2012
  • The selection of the training set influences classification accuracy in supervised classification of satellite imagery. In general, overall accuracy is high when pure pixels, whose class is unambiguous, are selected as the training set, and it decreases when mixed pixels are selected because of low image resolution or unclear class distinctions. In practice, however, it is very difficult to choose only pure pixels as the training set. This study therefore aims to identify a suitable classification method for the case in which mixed pixels are chosen. To this end, a small number of pure pixels were chosen as the training set, and the resulting classification accuracy was compared with that obtained using an equal number of mixed pixels. SVM achieved the highest accuracy among the classification methods using mixed pixels, and its accuracy differed relatively little from that obtained with pure pixels. SVM-based image classification is therefore the most suitable for areas where buildings and green space are mixed, because mixed pixels are highly likely to be chosen as the training set there.

A study on the Classification Schemes of Internet Resources for Industry (산업 분야 인터넷 자원의 분류체계에 관한 연구)

  • 한상길
    • Journal of the Korean Society for information Management / v.18 no.3 / pp.285-309 / 2001
  • Industry information is growing faster than any other information resource in the Internet age. Unfortunately, there is no consensus among information providers in the industry fields on a standard classification. This is a problem not only for the continuous and systematic development of industry information, but also for its use. This study aims to propose a well-structured and efficient classification scheme for industry information that helps users retrieve Internet resources easily. To do this, we analyzed the subject classification schemes of domestic industry information web sites, most of which adopt the "Korean Standard for the Industry Classification". In addition, we suggested principles of subject classification and their hierarchical structure, derived from an analysis of knowledge and document classification schemes. As a result, we propose an optimized industry classification scheme based on a validity test of the classification items, measured by a quantitative analysis of the industry information currently accessible through the Internet.

An Analytical Study on Performance Factors of Automatic Classification based on Machine Learning (기계학습에 기초한 자동분류의 성능 요소에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for information Management / v.33 no.2 / pp.33-59 / 2016
  • This study examined the factors affecting the performance of machine-learning-based automatic classification of domestic conference papers. Specifically, using the Rocchio algorithm to automatically assign class labels to papers in the Proceedings of the Conference of the Korean Society for Information Management, I investigated the characteristics of the key factors (classifier formation methods, training set size, weighting schemes, and label assignment methods) through a range of experiments. The results show that it is more effective to apply appropriate parameters ($\beta$, $\lambda$) and training set size (more than 5 years) according to the classification environment and the properties of the document set. I also found that, when performance is equivalent, the simpler methods (single weighting schemes) are very efficient. Furthermore, because classifying domestic papers is a multi-label classification problem, in which more than one label is assigned to an article, an optimal classification model needs to be developed from the characteristics of the key factors with this environment in mind.
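
For readers unfamiliar with Rocchio classification, a minimal sketch of the textbook form follows. The abstract does not define its $\beta$ and $\lambda$, so the sketch uses the common formulation with a positive-example weight `beta` and a negative-example weight `gamma`; these names, their default values, the similarity threshold, and the toy term-weight vectors are all assumptions and may not match the paper's setup.

```python
import math


def mean_vector(vectors, dim):
    """Component-wise mean of a list of vectors (zeros if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_prototype(positives, negatives, beta=16.0, gamma=4.0):
    """Textbook Rocchio prototype: beta * mean(positives) - gamma * mean(negatives)."""
    dim = len(positives[0])
    pos = mean_vector(positives, dim)
    neg = mean_vector(negatives, dim)
    return [beta * p - gamma * n for p, n in zip(pos, neg)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_labels(doc, prototypes, threshold=0.3):
    """Multi-label assignment: every label whose prototype is similar enough."""
    return sorted(
        label for label, proto in prototypes.items()
        if cosine(doc, proto) >= threshold
    )


# Toy term-weight vectors for two class labels (illustrative only).
training = {
    "retrieval": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "classification": [[0.0, 1.0, 0.0], [0.0, 0.9, 0.2]],
}
prototypes = {}
for label, positives in training.items():
    negatives = [v for other, docs in training.items() if other != label for v in docs]
    prototypes[label] = rocchio_prototype(positives, negatives)

print(assign_labels([1.0, 0.0, 0.0], prototypes))
print(assign_labels([1.0, 1.0, 0.0], prototypes))
```

Thresholding the similarity rather than taking the single best label is one simple way to allow more than one label per article, which is the multi-label setting the abstract describes.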

Learning Networks for Learning the Pattern Vectors causing Classification Error (분류오차유발 패턴벡터 학습을 위한 학습네트워크)

  • Lee Yong-Gu;Choi Woo-Seung
    • Journal of the Korea Society of Computer and Information / v.10 no.5 s.37 / pp.77-86 / 2005
  • In this paper, we design an LVQ learning algorithm that extracts the pattern vectors causing classification errors, learns them, and thereby improves classification performance. The proposed LVQ learning algorithm is a learning network that uses SOM to learn the initial reference vectors and the out-star learning algorithm to determine the class of the output neurons of LVQ. To extract the pattern vectors that cause classification errors, we propose an error-cause condition. Using that condition, we construct a pattern vector space consisting of the input pattern vectors that cause classification errors and learn these pattern vectors, improving pattern classification performance. To evaluate the proposed learning algorithm, simulations were performed with training and test vectors drawn from Fisher's Iris data and EMG data, and the classification performance of the proposed method was compared with that of conventional LVQ. The results confirmed that the proposed learning method classifies more successfully than conventional LVQ.
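
As background for the conventional LVQ baseline mentioned in this abstract, here is a minimal sketch of the standard LVQ1 update rule. It is not the proposed algorithm: the SOM initialization, the out-star class assignment, and the error-cause condition are not specified in the abstract and are left out. The prototype values and learning rate are illustrative assumptions.

```python
def nearest(prototypes, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    def dist2(p):
        return sum((pi - xi) ** 2 for pi, xi in zip(p, x))
    return min(range(len(prototypes)), key=lambda i: dist2(prototypes[i][0]))


def lvq1_step(prototypes, x, label, lr=0.1):
    """Apply one LVQ1 update in place and return the winner's index.

    prototypes: list of [vector, class_label] pairs.
    The winning prototype moves toward x if its class matches `label`,
    and away from x otherwise.
    """
    i = nearest(prototypes, x)
    vec, cls = prototypes[i]
    sign = 1.0 if cls == label else -1.0
    prototypes[i][0] = [v + sign * lr * (xi - v) for v, xi in zip(vec, x)]
    return i


def classify(prototypes, x):
    """Class of the nearest prototype."""
    return prototypes[nearest(prototypes, x)][1]


prototypes = [[[0.0, 0.0], "a"], [[1.0, 1.0], "b"]]
lvq1_step(prototypes, [0.2, 0.0], "a", lr=0.5)  # correct winner: pulled toward x
lvq1_step(prototypes, [0.9, 0.9], "a", lr=0.5)  # wrong winner: pushed away from x
print(prototypes)
print(classify(prototypes, [0.1, 0.2]))
```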

Emotion Recognition Method Using FLD and Staged Classification Based on Profile Data (프로파일기반의 FLD와 단계적 분류를 이용한 감성 인식 기법)

  • Kim, Jae-Hyup;Oh, Na-Rae;Jun, Gab-Song;Moon, Young-Shik
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.6 / pp.35-46 / 2011
  • In this paper, we propose an emotion recognition method using a staged classification model and Fisher's linear discriminant. By organizing the classification in stages, the proposed method improves the classification rate on the highly complex Fisher feature space. The staged classification model is built by successively combining binary classification models, which have a simple structure and high performance. At each stage, the method forms Fisher's linear discriminant for the two groups into which the emotion classes are divided and generates a binary classification model by applying the AdaBoost method on the Fisher space. The learning process is repeated until all emotion classes have been separated. In the experiments, the proposed method achieved a classification rate of about 72% on 8 emotion classes and about 93% on 3 specific emotion classes.
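
The per-stage building block, Fisher's linear discriminant for two groups, can be sketched for 2-D features as below. This is a simplified illustration: the paper trains AdaBoost on the Fisher space, whereas this sketch uses a plain midpoint threshold (an assumption), and the sample points are invented.

```python
def mean(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix of one class."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fisher_direction(class0, class1):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0).

    The threshold is the midpoint of the projected class means,
    a simple choice assumed here for illustration.
    """
    m0, m1 = mean(class0), mean(class1)
    a0, b0, c0 = scatter(class0, m0)
    a1, b1, c1 = scatter(class1, m1)
    a, b, c = a0 + a1, b0 + b1, c0 + c1  # within-class scatter Sw = [[a, b], [b, c]]
    det = a * c - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    w = [(c * dx - b * dy) / det, (a * dy - b * dx) / det]
    proj0 = w[0] * m0[0] + w[1] * m0[1]
    proj1 = w[0] * m1[0] + w[1] * m1[1]
    return w, 0.5 * (proj0 + proj1)


def predict(w, threshold, x):
    """1 if the projection of x falls on the class1 side, else 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0


group0 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
group1 = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0], [4.0, 4.0]]
w, threshold = fisher_direction(group0, group1)
print(w, threshold)
print(predict(w, threshold, [0.5, 0.2]), predict(w, threshold, [3.8, 3.1]))
```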

Classification of tree species using high-resolution QuickBird-2 satellite images in the valley of Ui-dong in Bukhansan National Park

  • Choi, Hye-Mi;Yang, Keum-Chul
    • Journal of Ecology and Environment / v.35 no.2 / pp.91-98 / 2012
  • This study was performed to assess the feasibility of tree species classification using high-resolution QuickBird-2 images, through a comparison of the spectral characteristics (digital numbers [DNs]) of tree species, tree species classification, and accuracy verification. In October 2010, three conifer and eight broad-leaved tree species were examined in the study area. The spectral characteristics of each species were observed, and the study area was classified by image classification. The results were as follows. The panchromatic band and multi-spectral band 4 were found to be useful for tree species classification. DN values of conifers were lower than those of broad-leaved trees. Vegetation indices such as the normalized difference vegetation index (NDVI), soil brightness index (SBI), green vegetation index (GVI), and Biband showed patterns similar to band 4 and panchromatic (PAN), and Tukey's multiple comparison test showed significant differences among tree species. However, species within the same genus, such as Pinus densiflora and P. rigida, and Quercus mongolica and Q. serrata, showed similar DN patterns and were therefore difficult to distinguish by supervised classification. Random selection of validation pixels gave an overall classification accuracy of 74.1% and a Kappa coefficient of 70.6%. The classification accuracy of Pterocarya stenoptera, 89.5%, was the highest. The classification accuracy of broad-leaved trees was lower than expected, ranging from 47.9% to 88.9%. P. densiflora and P. rigida, and Q. mongolica and Q. serrata, were each classified as the same species because their spectral patterns did not differ significantly.
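
Two of the quantities in this abstract, NDVI and the overall accuracy with its Kappa coefficient, follow standard formulas that can be sketched as below. The DN values and the confusion matrix are invented for illustration and are not the study's data; treating band 4 as near-infrared and band 3 as red follows the usual QuickBird band layout and is an assumption about the paper's processing.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j]: number of validation pixels of reference class i
    that were labelled as class j. Kappa is undefined when the expected
    agreement equals 1 (all pixels in a single class).
    """
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Made-up DNs for one pixel: band 4 (NIR) and band 3 (red).
print(ndvi(nir=200.0, red=100.0))

# Made-up two-class confusion matrix (rows: reference, columns: classified).
accuracy, kappa = accuracy_and_kappa([[40, 10], [5, 45]])
print(accuracy, kappa)
```

Kappa discounts the agreement expected by chance, which is why it is lower than the overall accuracy in the abstract as well (70.6% against 74.1%).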

Robust Face Recognition under Limited Training Sample Scenario using Linear Representation

  • Iqbal, Omer;Jadoon, Waqas;ur Rehman, Zia;Khan, Fiaz Gul;Nazir, Babar;Khan, Iftikhar Ahmed
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.7 / pp.3172-3193 / 2018
  • Recently, several studies have shown that linear-representation-based approaches are very effective and efficient for image classification. One of these approaches is the collaborative representation (CR) method. Existing CR-based algorithms have two major problems that degrade their classification performance. The first arises from the limited number of available training samples: large variations between query and training samples, caused by illumination and expression changes, lead to poor classification performance. The second occurs when an image is partially corrupted (contiguous occlusion); as part of the image becomes corrupt, classification performance degrades as well. We aim to extend the collaborative representation framework to face recognition with limited training samples. Our proposed solution generates virtual samples and intra-class variations from the training data to model the variations between query and training samples effectively. For robust classification, image patches are used to compute the representation, which addresses partial occlusion and leads to more accurate classification results. The proposed method thus computes the representation from local regions of the images, whereas CR computes it from a global solution involving entire images. Furthermore, the proposed solution integrates locality structure into CR using the Euclidean distance between the query and training samples. Intuitively, if the query sample is represented by its nearest neighbours, which lie on the same linear subspace, the resulting representation will be more discriminative and will classify the query sample more accurately. Hence, our proposed framework turns the limited-sample face recognition problem into a sufficient-training-samples problem using virtual samples and intra-class variations generated from the training samples, which improves classification accuracy, as the experimental results show. Moreover, it computes the representation from local image patches for robust classification and is expected to greatly increase classification performance for the face recognition task.
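
The baseline CR method that this abstract extends can be sketched as a ridge-regularized least-squares coding of the query over all training samples, followed by a per-class reconstruction residual. This is only the global baseline under simplifying assumptions: it uses the plain residual (some CR variants also normalize by the coefficient norm), it omits the virtual samples, patches, and locality weighting of the proposed method, and the tiny 3-D "images" and the value of `lam` are invented.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def crc_classify(samples, labels, query, lam=0.01):
    """Label of the class whose samples best reconstruct the query.

    Solves (G + lam * I) a = b, where G is the Gram matrix of the training
    samples and b holds their inner products with the query; lam must be
    positive so that the system is non-singular.
    """
    n = len(samples)
    gram = [
        [dot(samples[i], samples[j]) + (lam if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    coeffs = solve(gram, [dot(s, query) for s in samples])
    best_label, best_residual = None, float("inf")
    for label in sorted(set(labels)):
        recon = [0.0] * len(query)
        for sample, sample_label, a in zip(samples, labels, coeffs):
            if sample_label == label:
                recon = [r + a * v for r, v in zip(recon, sample)]
        residual = sum((q - r) ** 2 for q, r in zip(query, recon)) ** 0.5
        if residual < best_residual:
            best_label, best_residual = label, residual
    return best_label


samples = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = ["A", "A", "B", "B"]
print(crc_classify(samples, labels, [0.95, 0.05, 0.0]))
print(crc_classify(samples, labels, [0.0, 0.95, 0.05]))
```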