• Title/Summary/Keyword: multi-class classification (다중 클래스 분류)

Integrating Multiple Classifiers in a GA-based Inductive Learning Environment (유전 알고리즘 기반 귀납적 학습 환경에서 분류기의 통합)

  • Kim, Yeong-Joon
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.3 / pp.614-621 / 2006
  • We have implemented a multiclassifier learning approach in a GA-based inductive learning environment that learns classification rules similar to those used in PROSPECTOR. In this approach, a classification system is built from several classifiers obtained by running a GA-based learning system several times, with the aim of improving overall classification performance. Implementing it requires a decision-making scheme that can draw a single decision from multiple classifiers. In this paper, we introduce two such schemes: one combines the posterior odds that the classifiers give to each class, and the other is a voting scheme based on the rank each classifier assigns to each class. We also present empirical results that evaluate the effect of the multiclassifier learning approach on the GA-based inductive learning environment.
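
The two decision-making schemes named in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact combination rules, so the choices below are assumptions: posterior odds are combined by multiplication (which treats the classifiers as independent), and the rank-based vote is a Borda-style count. The function names and the sample odds are hypothetical.

```python
from math import prod

def combine_posterior_odds(odds_per_classifier):
    """Multiply each class's posterior odds across classifiers (assumed rule)."""
    classes = list(odds_per_classifier[0])
    combined = {c: prod(odds[c] for odds in odds_per_classifier) for c in classes}
    return max(combined, key=combined.get), combined

def rank_vote(odds_per_classifier):
    """Borda-style vote (assumed rule): a higher rank earns more points."""
    classes = list(odds_per_classifier[0])
    points = {c: 0 for c in classes}
    for odds in odds_per_classifier:
        ranked = sorted(classes, key=lambda c: odds[c], reverse=True)
        for position, c in enumerate(ranked):
            points[c] += len(classes) - 1 - position
    return max(points, key=points.get), points

# Hypothetical posterior odds from three classifiers for classes a, b, c.
odds = [
    {"a": 20.0, "b": 1.0, "c": 0.5},
    {"a": 0.5, "b": 2.0, "c": 1.0},
    {"a": 1.5, "b": 2.0, "c": 0.25},
]
odds_winner, combined = combine_posterior_odds(odds)
vote_winner, points = rank_vote(odds)
print(odds_winner, vote_winner)
```

On this made-up input the two schemes disagree: one very confident classifier dominates the odds product, while the rank vote favours the class that is ranked consistently high.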

Emotion Recognition Method Using FLD and Staged Classification Based on Profile Data (프로파일기반의 FLD와 단계적 분류를 이용한 감성 인식 기법)

  • Kim, Jae-Hyup;Oh, Na-Rae;Jun, Gab-Song;Moon, Young-Shik
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.6 / pp.35-46 / 2011
  • In this paper, we propose an emotion recognition method that uses a staged classification model and Fisher's linear discriminant (FLD). By organizing classification into stages, the proposed method improves the classification rate on a highly complex Fisher feature space. The staged model is built by successively combining binary classification models, each of which has a simple structure and high performance. At each stage, the method forms a Fisher's linear discriminant for two groups of emotion classes and generates a binary classification model by applying AdaBoost in the Fisher space. The learning process is repeated until all emotion classes have been separated. In experiments, the proposed method achieves a classification rate of about 72% on 8 emotion classes and about 93% on 3 specific emotion classes.

K-means Support Vector Data Description concerning Negative data (Negative data를 고려한 K-means Support Vector Data Description)

  • Song, Dong-Sung;Kim, Pyo-Jae;Chang, Hyung-Jin;Choi, Jin-Young
    • Proceedings of the KIEE Conference / 2007.04a / pp.310-312 / 2007
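  • SVDD is a one-class classification technique, but it can also be applied to multi-class classification. In that setting, SVDD with negative data has been used to keep data from other classes from falling inside the learned boundary of the class under consideration. However, this method has the problem that training time increases as the number of data points to be considered grows. In this paper, we propose a method that mitigates the increase in training time when learning with negative data by using KMSVDD instead of SVDD and by excluding data that lie in regions where they cannot be negative data. With this method, similar results can be obtained in less time than when all data not belonging to the target class are treated as negative data during training. The effect is verified through several simulations.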

Traffic Anomaly Identification Using Multi-Class Support Vector Machine (다중 클래스 SVM을 이용한 트래픽의 이상패턴 검출)

  • Park, Young-Jae;Kim, Gye-Young;Jang, Seok-Woo
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.4 / pp.1942-1950 / 2013
  • This paper suggests a new method for detecting attacks in network traffic by visualizing the original traffic data and applying a multi-class SVM (support vector machine). The proposed method first generates 2D images from the IP addresses and ports of transmitters and receivers, and extracts from the images the linear patterns and high intensity values that represent traffic attacks. It then obtains the variance of the transmitter and receiver ports and extracts the number of clusters and entropy features using the ISODATA algorithm. Finally, it uses the multi-class SVM to determine whether the traffic data contain DDoS, DoS, Internet worm, or port scan attacks. Experimental results show that the suggested multi-class SVM-based algorithm can more effectively detect network traffic attacks.
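
As a rough illustration of the port variance and entropy features mentioned in this abstract, the sketch below computes both for a list of observed port numbers. It is an assumption about how such features might be computed, not the paper's procedure: the image generation and the ISODATA clustering step are omitted, and the function names and sample port lists are hypothetical.

```python
from collections import Counter
from math import log2
from statistics import pvariance

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def port_features(ports):
    """Population variance and entropy of a list of port numbers."""
    return {"variance": pvariance(ports), "entropy": shannon_entropy(ports)}

# Hypothetical destination ports: ordinary web traffic vs. a scan-like spread.
single_service = port_features([80, 80, 80, 80])
scan_like = port_features([1, 2, 3, 4])
print(single_service, scan_like)
```

Traffic aimed at a single port gives zero variance and zero entropy, while traffic spread evenly over many ports gives higher values, which is why such features could help separate port scans from ordinary traffic.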

Medical Image Classification and Keyword Annotation Using Combination of Random Forests and Relation Weight (Random Forests와 관계 가중치 결합을 이용한 의료 영상 분류 및 주석 자동 생성)

  • Lee, Ji-hyun;Kim, Seong-hoon;Ko, Byoung-chul;Nam, Jae-Yeal
    • Proceedings of the Korea Information Processing Society Conference / 2010.11a / pp.596-598 / 2010
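  • This paper presents a method that classifies X-ray images, a type of medical image, and generates multiple keywords according to the classification result. Because most X-ray images are grayscale, Local Binary Patterns (LBP) are used to extract the relationships between pixels as features, and a Random Forests classifier, which supports real-time training and classification, classifies the images into 30 classes. In addition, predefined relation weights between body parts are combined with the classification scores to produce confidence values, on the basis of which multiple annotations are assigned to each image. These annotations enable keyword-based medical image retrieval and thus provide an easier and more efficient search environment.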

Object-based Image Classification by Integrating Multiple Classes in Hue Channel Images (Hue 채널 영상의 다중 클래스 결합을 이용한 객체 기반 영상 분류)

  • Ye, Chul-Soo
    • Korean Journal of Remote Sensing / v.37 no.6_3 / pp.2011-2025 / 2021
  • In high-resolution satellite image classification, when pixels belonging to one class have different color values, as with buildings of various colors, it is difficult to determine the color information that represents the class. To solve this problem, we propose a method that divides the hue channel of the HSV (Hue, Saturation, Value) color space and performs object-based classification. To this end, the input image is transformed from the RGB color space into HSV components, and the Hue component is divided into subchannels at regular intervals. Minimum distance-based image classification is performed for each hue subchannel, and the classification result is combined with the image segmentation result. When the proposed method was applied to KOMPSAT-3A imagery, the overall accuracy was 84.97% and the kappa coefficient was 77.56%, and the classification accuracy improved by more than 10% compared with commercial software.
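
The RGB-to-HSV conversion and the division of the Hue component into regular subchannels can be sketched as follows. This is only an illustration of that first step. The number of subchannels (4 here) and the function name are assumptions, and the minimum-distance classification and the segmentation merge described in the abstract are not shown.

```python
import colorsys

def hue_subchannel(rgb, n_subchannels=4):
    """Return the index of the equal-width hue interval that an RGB pixel falls in."""
    r, g, b = (v / 255.0 for v in rgb)
    hue, _saturation, _value = colorsys.rgb_to_hsv(r, g, b)  # hue lies in [0, 1)
    # Clamp so that a hue at the very top of the range stays in the last interval.
    return min(int(hue * n_subchannels), n_subchannels - 1)

# Hypothetical pixels: each primary colour lands in a different subchannel.
pixels = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}
indices = {name: hue_subchannel(rgb) for name, rgb in pixels.items()}
print(indices)
```

Pixels with very low saturation have an unstable hue, so a practical pipeline would probably need to handle them separately. That is a general property of HSV, not something stated in the abstract.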

Word Sense Classification Using Support Vector Machines (지지벡터기계를 이용한 단어 의미 분류)

  • Park, Jun Hyeok;Lee, Songwook
    • KIPS Transactions on Software and Data Engineering / v.5 no.11 / pp.563-568 / 2016
  • The word sense disambiguation problem is to find the correct sense, in a given sentence, of an ambiguous word that has multiple senses in a dictionary. We regard this problem as a multi-class classification problem and classify the ambiguous word using Support Vector Machines. Context words of the ambiguous word, extracted from the Sejong sense-tagged corpus, are represented in two kinds of vector spaces. One vector space is composed of context word vectors with binary weights. The other contains vectors to which the context words are mapped by a word embedding model. In experiments, we obtained an accuracy of 87.0% with the context word vectors and 86.0% with the word embedding model.
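
The first representation, context word vectors with binary weights, can be illustrated with a short sketch. The vocabulary and the example context are made up, and the function name is hypothetical. In the paper such vectors would be the input to an SVM, which is not shown here.

```python
def binary_context_vector(context_words, vocabulary):
    """Return 1 for each vocabulary word that occurs in the context, else 0."""
    present = set(context_words)
    return [1 if word in present else 0 for word in vocabulary]

# Hypothetical vocabulary and context for an ambiguous word such as "bank".
vocabulary = ["river", "money", "loan", "water", "account"]
context = ["she", "opened", "an", "account", "for", "the", "loan"]
vector = binary_context_vector(context, vocabulary)
print(vector)
```

Each position records only whether a vocabulary word appears in the context, not how often, which is what distinguishes binary weights from count-based weights.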

Diagnosis of Valve Internal Leakage for Ship Piping System using Acoustic Emission Signal-based Machine Learning Approach (선박용 밸브의 내부 누설 진단을 위한 음향방출신호의 머신러닝 기법 적용 연구)

  • Lee, Jung-Hyung
    • Journal of the Korean Society of Marine Environment & Safety / v.28 no.1 / pp.184-192 / 2022
  • Valve internal leakage is caused by damage to the internal parts of the valve, resulting in accidents and shutdowns of the piping system. This study investigated the possibility of a real-time leak detection method using the acoustic emission (AE) signal generated from the piping system during the internal leakage of a butterfly valve. Datasets of raw time-domain AE signals were collected and postprocessed for each operation mode of the valve in a systematic manner to develop a data-driven model for the detection and classification of internal leakage, by applying machine learning algorithms. The aim of this study was to determine whether it is possible to treat leak detection as a classification problem by applying two classification algorithms: support vector machine (SVM) and convolutional neural network (CNN). The results showed different performances for the algorithms and datasets used. The SVM-based binary classification models, based on feature extraction of data, achieved an overall accuracy of 83% to 90%, while in the case of a multiple classification model, the accuracy was reduced to 66%. By contrast, the CNN-based classification model achieved an accuracy of 99.85%, which is superior to those of any other models based on the SVM algorithm. The results revealed that the SVM classification model requires effective feature extraction of the AE signals to improve the accuracy of multi-class classification. Moreover, the CNN-based classification can be a promising approach to detect both leakage and valve opening as long as the performance of the processor does not degrade.

Label Embedding for Improving Classification Accuracy Using AutoEncoder with Skip-Connections (다중 레이블 분류의 정확도 향상을 위한 스킵 연결 오토인코더 기반 레이블 임베딩 방법론)

  • Kim, Museong;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.175-197 / 2021
  • Recently, with the development of deep learning technology, research on unstructured data analysis has been actively conducted and is showing remarkable results in fields such as classification, summarization, and generation. Among the various text analysis fields, text classification is the most widely used technology in academia and industry. Text classification includes binary classification, which assigns one label from two classes; multi-class classification, which assigns one label from several classes; and multi-label classification, which assigns multiple labels from several classes. In particular, multi-label classification requires a different training method from binary and multi-class classification because each instance has multiple labels. In addition, because the number of labels to be predicted grows as the number of labels and classes increases, prediction becomes harder and performance is difficult to improve. To overcome these limitations, research on label embedding is being actively conducted. Label embedding (i) compresses the initially given high-dimensional label space into a low-dimensional latent label space, (ii) trains a model to predict the compressed label, and (iii) restores the predicted label to the original high-dimensional label space. Typical label embedding techniques include Principal Label Space Transformation (PLST), Multi-Label Classification via Boolean Matrix Decomposition (MLC-BMaD), and Bayesian Multi-Label Compressed Sensing (BML-CS). However, because these techniques consider only linear relationships between labels or compress the labels by random transformation, they cannot capture non-linear relationships between labels and therefore cannot create a latent label space that sufficiently contains the information of the original labels.
Recently, there have been increasing attempts to improve performance by applying deep learning technology to label embedding. A representative example is label embedding using an autoencoder, a deep learning model that is effective for data compression and restoration. However, traditional autoencoder-based label embedding loses a large amount of information when compressing a high-dimensional label space with a very large number of classes into a low-dimensional latent label space. This can be traced to the vanishing gradient problem that occurs during backpropagation. Skip connections were devised to solve this problem: by adding the input of a layer to its output, they prevent the gradient from vanishing during backpropagation, so efficient learning is possible even when the network is deep. Skip connections are mainly used for image feature extraction in convolutional neural networks, but studies that use them in autoencoders or in the label embedding process are still lacking. Therefore, in this study, we propose an autoencoder-based label embedding methodology in which skip connections are added to both the encoder and the decoder to form a low-dimensional latent label space that reflects the information of the high-dimensional label space well. The proposed methodology was applied to actual paper keywords to derive the high-dimensional keyword label space and the low-dimensional latent label space. Using these, we conducted an experiment that predicts the compressed keyword vector in the latent label space from a paper abstract and evaluates multi-label classification by restoring the predicted keyword vector to the original label space. As a result, the accuracy, precision, recall, and F1 score used as performance indicators were far higher for multi-label classification based on the proposed methodology than for traditional multi-label classification methods.
This indicates that the low-dimensional latent label space derived through the proposed methodology reflected the information of the high-dimensional label space well, which ultimately improved the performance of multi-label classification itself. In addition, the utility of the proposed methodology was examined by comparing its performance across domain characteristics and the number of dimensions of the latent label space.
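
The skip connection this abstract describes, adding the input of a layer to its output, can be shown with a small dependency-free sketch. It is not the authors' architecture: the layer sizes, the ReLU activation, and the function names are assumptions, and a real autoencoder would use a deep learning framework with trained weights.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]

def dense(x, weights, bias):
    """Compute W x + b, with the weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def skip_block(x, weights, bias):
    """Return x + relu(W x + b); W must be square so the shapes match."""
    transformed = relu(dense(x, weights, bias))
    return [xi + ti for xi, ti in zip(x, transformed)]

# Hypothetical 3-dimensional multi-hot label vector.
x = [1.0, 0.0, 1.0]
zeros = [[0.0] * 3 for _ in range(3)]
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
unchanged = skip_block(x, zeros, [0.0] * 3)
doubled = skip_block(x, identity, [0.0] * 3)
print(unchanged, doubled)
```

Even when the layer's weights contribute nothing, the input passes through unchanged. This identity path is what lets the gradient reach earlier layers during backpropagation.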

Feature Extraction and Classification of Multi-temporal SAR Data Using 3D Wavelet Transform (3차원 웨이블렛 변환을 이용한 다중시기 SAR 영상의 특징 추출 및 분류)

  • Yoo, Hee Young;Park, No-Wook;Hong, Sukyoung;Lee, Kyungdo;Kim, Yihyun
    • Korean Journal of Remote Sensing / v.29 no.5 / pp.569-579 / 2013
  • In this study, land-cover classification was implemented using features extracted from multi-temporal SAR data through a 3D wavelet transform, and the applicability of the 3D wavelet transform as a feature extraction approach was evaluated. The feature extraction stage based on the 3D wavelet transform was carried out before classification, and the extracted features were used as input for land-cover classification. For comparison, the original image data without the feature extraction stage and Principal Component Analysis (PCA) based features were also classified. Multi-temporal Radarsat-1 data acquired at Dangjin, Korea were used for this experiment, and five land-cover classes (paddy fields, dry fields, forest, water, and built-up areas) were considered for classification. According to the discrimination capability analysis, the characteristics of dry fields and forest were similar, so it was very difficult to distinguish these two classes. When wavelet-based features were used, classification accuracy generally improved, except for the built-up class. In particular, accuracy improved for the dry field and forest classes. This improvement may be attributed to the wavelet transform decomposing the multi-temporal data not only temporally but also spatially. The experimental results show that the 3D wavelet transform would be an effective tool for feature extraction from multi-temporal data, although the procedure should be tested on other sensors and other areas through extensive experiments.