• Title/Summary/Keyword: classifiers

Search Results: 743

Ensemble learning of Regional Experts (지역 전문가의 앙상블 학습)

  • Lee, Byung-Woo;Yang, Ji-Hoon;Kim, Seon-Ho
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.2
    • /
    • pp.135-139
    • /
    • 2009
  • We present a new ensemble learning method that employs a set of regional experts, each of which learns to handle a subset of the training data. We split the training data and generate experts for different regions in the feature space. When classifying a data instance, we apply weighted voting among the experts whose regions include the instance. We used ten datasets to compare the performance of our new ensemble method with that of single classifiers as well as other ensemble methods such as Bagging and AdaBoost, with SMO, Naive Bayes, and C4.5 as base learning algorithms. We found that the performance of our method is comparable to that of AdaBoost and Bagging when the base learner is C4.5; in the remaining cases, our method outperformed the benchmark methods.
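A minimal sketch of the regional-experts idea described above, under simplifying assumptions (a 1-D feature space split into equal-width regions, each expert reduced to a majority-class predictor weighted by its training accuracy); this is an illustration, not the authors' code:

```python
# Partition a 1-D feature space into regions, train one simple "expert"
# per region, and classify a query by weighted voting among the experts
# whose region contains it.

def train_regional_experts(data, n_regions=2):
    """data: list of (x, label). Returns list of (lo, hi, majority_label, weight)."""
    xs = sorted(x for x, _ in data)
    lo, hi = xs[0], xs[-1]
    width = (hi - lo) / n_regions or 1.0
    experts = []
    for i in range(n_regions):
        r_lo = lo + i * width
        r_hi = r_lo + width
        members = [(x, y) for x, y in data
                   if r_lo <= x < r_hi or (i == n_regions - 1 and x == hi)]
        if not members:
            continue
        labels = [y for _, y in members]
        majority = max(set(labels), key=labels.count)
        weight = labels.count(majority) / len(labels)  # training accuracy as vote weight
        experts.append((r_lo, r_hi, majority, weight))
    return experts

def classify(experts, x):
    votes = {}
    for r_lo, r_hi, label, w in experts:
        if r_lo <= x <= r_hi:  # only experts whose region contains the query vote
            votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get) if votes else None

data = [(0.1, 'a'), (0.2, 'a'), (0.3, 'a'), (0.7, 'b'), (0.8, 'b'), (0.9, 'b')]
experts = train_regional_experts(data, n_regions=2)
print(classify(experts, 0.15))  # → a
print(classify(experts, 0.85))  # → b
```

In a real setting the per-region experts would be trained SMO, Naive Bayes, or C4.5 models as in the paper, and the regions would live in a multi-dimensional feature space.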

Classification of Imbalanced Data Using Multilayer Perceptrons (다층퍼셉트론에 의한 불균현 데이터의 학습 방법)

  • Oh, Sang-Hoon
    • The Journal of the Korea Contents Association
    • /
    • v.9 no.7
    • /
    • pp.141-148
    • /
    • 2009
  • Recently there have been many research efforts focused on imbalanced data classification problems, since they are pervasive but hard to solve. Approaches to imbalanced data problems can be categorized into data-level approaches using re-sampling, algorithmic-level approaches using cost functions, and ensembles of basic classifiers for performance improvement. As an algorithmic-level approach, this paper proposes to use multilayer perceptrons with higher-order error functions. The error functions intensify the training of minority class patterns and weaken the training of majority class patterns. Mammography and thyroid datasets are used to verify the superiority of the proposed method over other methods such as the mean-squared error, two-phase, and threshold-moving methods.
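The abstract does not give the exact form of the higher-order error functions; as a hedged illustration, an assumed n-th order error |t − y|ⁿ shows the qualitative effect of raising the error order:

```python
# Illustrative n-th order error term |t - y|**n; n = 2 recovers the
# squared error. (Assumed form for illustration only; the paper's
# exact error functions are not reproduced in the abstract.)

def nth_order_error(target, output, n):
    return abs(target - output) ** n

# The ratio between the n = 4 and n = 2 errors is |t - y|**2, so a
# higher order concentrates the penalty on badly misclassified
# patterns while the contribution of small residuals fades faster.
big_residual_ratio = nth_order_error(1.0, 0.1, 4) / nth_order_error(1.0, 0.1, 2)
small_residual_ratio = nth_order_error(1.0, 0.9, 4) / nth_order_error(1.0, 0.9, 2)
print(big_residual_ratio, small_residual_ratio)
```

Applying a higher order to minority-class patterns and a lower order to majority-class patterns is one way such a function could re-balance the training signal, per the abstract's description.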

DeepCleanNet: Training Deep Convolutional Neural Network with Extremely Noisy Labels

  • Olimov, Bekhzod;Kim, Jeonghong
    • Journal of Korea Multimedia Society
    • /
    • v.23 no.11
    • /
    • pp.1349-1360
    • /
    • 2020
  • In recent years, Convolutional Neural Networks (CNNs) have been successfully applied to various computer vision tasks. Since CNN models are representatives of supervised learning algorithms, they demand a large amount of data to train the classifiers. Thus, obtaining data with correct labels is imperative to attain the state-of-the-art performance of CNN models. However, labelling datasets is a tedious and expensive process, so real-life datasets often contain incorrect labels. Although the issue of poorly labelled datasets has been studied before, we have noticed that existing methods are very complex and hard to reproduce. Therefore, in this work we propose DeepCleanNet, a considerably simpler system that achieves competitive results compared to the existing methods. We use the K-means clustering algorithm to select data with correct labels and train a deep CNN model on the resulting dataset. The technique achieves competitive results in both the training and validation stages. We conducted experiments using the MNIST database of handwritten digits with 50% corrupted labels and achieved increases of up to 10% and 20% in training and validation set accuracy scores, respectively.
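A toy sketch of the label-cleaning idea: cluster the inputs, then keep only samples whose given label agrees with the majority label of their cluster. This uses a simplified 1-D k-means for illustration, not the paper's pipeline:

```python
# Select data with (probably) correct labels by clustering inputs and
# discarding samples whose label disagrees with their cluster majority.

def kmeans_1d(xs, k=2, iters=20):
    # Tiny k-means for 1-D points, seeded with the extreme values (k <= 2 here).
    centers = [min(xs), max(xs)][:k]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for x in xs:
            i = min(range(len(centers)), key=lambda j: abs(x - centers[j]))
            groups[i].append(x)
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def clean_labels(data, k=2):
    xs = [x for x, _ in data]
    centers = kmeans_1d(xs, k)
    assign = [min(range(len(centers)), key=lambda j: abs(x - centers[j]))
              for x in xs]
    kept = []
    for c in range(len(centers)):
        labels = [y for (x, y), a in zip(data, assign) if a == c]
        if not labels:
            continue
        majority = max(set(labels), key=labels.count)
        kept += [(x, y) for (x, y), a in zip(data, assign)
                 if a == c and y == majority]
    return kept

# Two well-separated clusters; one sample in each carries a noisy label.
noisy = [(0.0, 'a'), (0.1, 'a'), (0.2, 'b'),   # 'b' is a wrong label
         (1.0, 'b'), (1.1, 'b'), (0.9, 'a')]   # 'a' is a wrong label
clean = clean_labels(noisy, k=2)
print(sorted(clean))  # noisy samples removed
```

In the paper's setting the clustering would operate on learned CNN features of images rather than raw scalars, and the cleaned set would then be used to train the deep model.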

Set Covering-based Feature Selection of Large-scale Omics Data (Set Covering 기반의 대용량 오믹스데이터 특징변수 추출기법)

  • Ma, Zhengyu;Yan, Kedong;Kim, Kwangsoo;Ryoo, Hong Seo
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.39 no.4
    • /
    • pp.75-84
    • /
    • 2014
  • In this paper, we deal with the feature selection problem for large-scale and high-dimensional biological data such as omics data. Most previous approaches use a simple score function to reduce the number of original variables and then select features from the small number of remaining variables. Methods that do not rely on such filtering either do not consider the interactions between variables or generate approximate solutions to a simplified problem. In contrast, by combining set covering and clustering techniques, we developed a new method that can handle the total number of variables and consider the combinatorial effects of variables when selecting good features. To demonstrate the efficacy and effectiveness of the method, we downloaded gene expression datasets from TCGA (The Cancer Genome Atlas) and compared our method with other algorithms, including the feature selection algorithms embedded in WEKA. The experimental results show that our method selects high-quality features that yield more accurate classifiers than the other feature selection algorithms.
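A hedged sketch of one common set-covering view of feature selection: each feature "covers" the cross-class sample pairs it can distinguish, and a greedy set-cover pass picks a small feature subset that separates every pair. This illustrates the formulation only, not the paper's algorithm:

```python
# Greedy set cover over "separation" sets: feature f covers pair (i, j)
# if samples i and j belong to different classes and differ on f.

def greedy_set_cover_features(samples, labels):
    n_feat = len(samples[0])
    pairs = [(i, j) for i in range(len(samples)) for j in range(len(samples))
             if i < j and labels[i] != labels[j]]
    covers = {f: {(i, j) for (i, j) in pairs if samples[i][f] != samples[j][f]}
              for f in range(n_feat)}
    uncovered, chosen = set(pairs), []
    while uncovered:
        best = max(range(n_feat), key=lambda f: len(covers[f] & uncovered))
        if not covers[best] & uncovered:
            break  # remaining pairs cannot be separated by any feature
        chosen.append(best)
        uncovered -= covers[best]
    return chosen

samples = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = ['pos', 'pos', 'neg', 'neg']
print(greedy_set_cover_features(samples, labels))  # one feature suffices here
```

Exact set-covering formulations, as in the paper, would solve this as an integer program rather than greedily, and clustering can shrink the pair set before covering.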

A Study on the Music Schedules in the 4th Edition of KDC (한국십진분류법 제4판 음악분야 전개상의 제문제)

  • Hahn Kyung-Shin
    • Journal of Korean Library and Information Science Society
    • /
    • v.30 no.1
    • /
    • pp.31-60
    • /
    • 1999
  • The purpose of this study is to investigate the problems concerning the music schedules of the KDC, focusing on the arrangement of the 670 Music division in the 4th edition. As background, the development of the 670 Music division from the 1st to the 4th edition of the KDC is examined first. The expansion of the 670 Music division in the 4th edition and its problems are then analyzed, and based on the findings, some suggestions for solving the problems are proposed. These problems originate in a lack of professional understanding of music, structural problems of the KDC itself, and a very broad and contradictory revision policy. To improve the KDC as the standard classification scheme of Korea, mutual cooperation between classifiers and music specialists is indispensable.


Variational Auto-Encoder Based Semi-supervised Learning Scheme for Learner Classification in Intelligent Tutoring System (지능형 교육 시스템의 학습자 분류를 위한 Variational Auto-Encoder 기반 준지도학습 기법)

  • Jung, Seungwon;Son, Minjae;Hwang, Eenjun
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.11
    • /
    • pp.1251-1258
    • /
    • 2019
  • An intelligent tutoring system enables users to learn effectively by utilizing various artificial intelligence techniques. For instance, it can recommend a proper curriculum or learning method to individual users based on their learning history. To do this effectively, users' characteristics need to be analyzed and classified along various aspects such as interest, learning ability, and personality. Although data labeled with these characteristics are required for accurate classification, it is not easy to acquire enough labeled data due to the labeling cost. Unlabeled data, on the other hand, require no labeling process, so a large amount can be collected and utilized. In this paper, we propose a semi-supervised learning method based on a feedback variational auto-encoder (FVAE), which uses both labeled and unlabeled data. The FVAE is a variant of the variational auto-encoder (VAE) in which a multilayer perceptron is added to provide feedback. We train the FVAE on unlabeled data and take its encoder; we then extract features from the labeled data using the encoder and train classifiers on the extracted features. In the experiments, we show that FVAE-based semi-supervised learning is superior to a VAE-based method in terms of accuracy and F1 score.
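A structural sketch of the two-stage pipeline described above. To stay self-contained, a trivial standardizing "encoder" fitted on the unlabeled pool stands in for the trained FVAE encoder, and a nearest-centroid rule stands in for the downstream classifier; both are illustrative assumptions:

```python
# Stage 1: fit a representation on unlabeled data.
# Stage 2: encode the labeled data and train an ordinary classifier on it.

def fit_encoder(unlabeled):
    # Stand-in for the FVAE encoder: per-feature standardization learned
    # from the unlabeled pool.
    n, d = len(unlabeled), len(unlabeled[0])
    mean = [sum(x[i] for x in unlabeled) / n for i in range(d)]
    std = [max((sum((x[i] - mean[i]) ** 2 for x in unlabeled) / n) ** 0.5, 1e-9)
           for i in range(d)]
    return lambda x: tuple((x[i] - mean[i]) / std[i] for i in range(d))

def nearest_centroid(train, query):
    # Stand-in classifier trained on encoded labeled data.
    cents = {}
    for z, y in train:
        cents.setdefault(y, []).append(z)
    cents = {y: tuple(sum(c) / len(zs) for c in zip(*zs)) for y, zs in cents.items()}
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    return min(cents, key=lambda y: dist(cents[y], query))

unlabeled = [(0.0, 1.0), (0.2, 0.8), (1.0, 0.0), (0.9, 0.1)]   # plentiful
labeled = [((0.1, 0.9), 'low'), ((0.95, 0.05), 'high')]        # scarce
encode = fit_encoder(unlabeled)
train = [(encode(x), y) for x, y in labeled]
print(nearest_centroid(train, encode((0.05, 0.92))))  # → low
```

The point of the structure is that the representation is learned from the cheap unlabeled pool, so only a small labeled set is needed for the supervised stage.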

Supervised Classification Systems for High Resolution Satellite Images (고해상도 위성영상을 위한 감독분류 시스템)

  • 전영준;김진일
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.9 no.3
    • /
    • pp.301-310
    • /
    • 2003
  • In this paper, we design and implement supervised classification systems for high resolution satellite images. The systems support various interfaces and statistics on training samples so that the most effective training data can be selected. In addition, new classification algorithms and satellite image formats can be added easily thanks to the modularized system design. The classifiers take the characteristics of the spectral bands of the selected training data into account and provide various supervised classification algorithms, including parallelepiped, minimum distance, Mahalanobis distance, maximum likelihood, and fuzzy-theory-based classification. We used IKONOS images as input and verified the systems on the classification of high resolution satellite images.
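A minimal sketch of one of the listed algorithms, the minimum-distance classifier: each class is represented by the mean of its training pixels across spectral bands, and a pixel is assigned to the class with the nearest mean (toy two-band values, not real IKONOS data):

```python
# Minimum-distance-to-mean classification of multispectral pixels.

def class_means(training):
    """training: list of (pixel_tuple, class_label)."""
    means = {}
    for x, y in training:
        means.setdefault(y, []).append(x)
    return {y: tuple(sum(c) / len(xs) for c in zip(*xs)) for y, xs in means.items()}

def minimum_distance_classify(means, pixel):
    d2 = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    return min(means, key=lambda y: d2(means[y], pixel))

# Toy two-band "spectral" training samples.
training = [((10, 40), 'water'), ((12, 42), 'water'),
            ((60, 20), 'urban'), ((62, 18), 'urban')]
means = class_means(training)
print(minimum_distance_classify(means, (11, 41)))  # → water
```

The Mahalanobis and maximum-likelihood variants named in the abstract additionally use each class's covariance, which rescales the distance by per-band variance and band correlations.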

Land Cover Classification of Image Data Using Artificial Neural Networks (인공신경망 모형을 이용한 영상자료의 토지피복분류)

  • Kang, Moon-Seong;Park, Seung-Woo;Kwang, Sik-Yoon
    • Journal of Korean Society of Rural Planning
    • /
    • v.12 no.1 s.30
    • /
    • pp.75-83
    • /
    • 2006
  • In this study, category classification was performed using the maximum likelihood method and an artificial neural network model, and their classification performances were compared and evaluated. The artificial neural network model used the error backpropagation algorithm, with the optimal number of hidden-layer nodes determined through training before the category classification was carried out. The optimal network consisted of 7 input nodes, 18 hidden nodes, and 5 output nodes. A Landsat TM-5 satellite image acquired in 1996 was used, and for classification by both methods, regions representing the spectral characteristics of each category were extracted. The classification accuracy was 90% for the artificial neural network model and 83% for the maximum likelihood method, showing the superior performance of the neural network model. Among the land cover categories, both methods showed large classification errors for dry fields and residential areas. In particular, the omission error for dry fields under the maximum likelihood method was very large, at 62.6%. This appears to be because, depending on the image acquisition date, fields and residential areas tend to be classified as bare land, forest, or paddy fields. If auxiliary information for each class is added for the category classification, the classification accuracy is expected to improve.
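A hedged sketch of the error-backpropagation training used by such a neural network classifier: one-hidden-layer sigmoid network updated by gradient descent on the squared error (tiny illustrative sizes, not the paper's 7-18-5 network):

```python
# One backpropagation step for a minimal 1-hidden-layer sigmoid network,
# repeated to show the training error decreasing.
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

random.seed(0)
n_in, n_hid, n_out = 2, 3, 2
W1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
W2 = [[random.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)]

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    o = [sigmoid(sum(w * hi for w, hi in zip(row, h))) for row in W2]
    return h, o

def backprop_step(x, target, lr=0.5):
    h, o = forward(x)
    # Output deltas, then hidden deltas propagated back through W2.
    d_out = [(o[k] - target[k]) * o[k] * (1 - o[k]) for k in range(n_out)]
    d_hid = [h[j] * (1 - h[j]) * sum(d_out[k] * W2[k][j] for k in range(n_out))
             for j in range(n_hid)]
    for k in range(n_out):
        for j in range(n_hid):
            W2[k][j] -= lr * d_out[k] * h[j]
    for j in range(n_hid):
        for i in range(n_in):
            W1[j][i] -= lr * d_hid[j] * x[i]
    return sum((o[k] - target[k]) ** 2 for k in range(n_out))

errs = [backprop_step((1.0, 0.0), (1.0, 0.0)) for _ in range(200)]
print(errs[0], errs[-1])  # squared error shrinks over training
```

In the study's setting the inputs would be the 7 spectral features per pixel and the 5 outputs the land cover categories; selecting the hidden-layer size (18 in the paper) is done empirically by training networks of different widths.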

Hybrid Neural Classifier Combined with H-ART2 and F-LVQ for Face Recognition

  • Kim, Do-Hyeon;Cha, Eui-Young;Kim, Kwang-Baek
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings
    • /
    • 2005.06a
    • /
    • pp.1287-1292
    • /
    • 2005
  • This paper presents an effective pattern classification model by designing artificial neural network based pattern classifiers for face recognition. First, an RGB image captured by a frame grabber is converted into an HSV image, which is closer to the human visual system. Then the coarse facial region is extracted using the hue (H) and saturation (S) components, excluding the intensity (V) component, which is sensitive to environmental illumination. Next, the fine facial region is extracted by matching against edge- and gray-based templates. To obtain light-invariant, high-quality facial images, histogram equalization and intensity compensation based on an illumination plane are performed. The extracted and enhanced facial images are then used to train the pattern classification models. The proposed H-ART2 model, which has hierarchical ART2 layers, and the F-LVQ model, which is optimized by fuzzy membership, make it possible to classify facial patterns by optimizing the relations of clusters and searching clustered reference patterns effectively. Experimental results show that the proposed face recognition system matches the SVM model, which is well known in the face recognition field, in recognition rate, and is even better in classification speed. Moreover, a high recognition rate could be acquired by combining the proposed neural classification models.
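A sketch of the first stage described above: convert RGB to HSV and keep pixels by hue and saturation only, ignoring the illumination-sensitive V channel. The skin thresholds below are illustrative assumptions, not values from the paper:

```python
# Coarse skin-pixel test on H and S only; V is deliberately unused so
# the test is less sensitive to environmental illumination.
import colorsys

def is_coarse_face_pixel(r, g, b, h_max=0.1, s_min=0.2, s_max=0.7):
    h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h <= h_max and s_min <= s <= s_max

print(is_coarse_face_pixel(220, 170, 140))  # warm, skin-like tone
print(is_coarse_face_pixel(40, 60, 200))    # blue pixel
```

Applied per pixel, this yields a coarse face mask that the subsequent template-matching stage refines.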


A Neural Net Classifier for Hangeul Recognition (한글 인식을 위한 신경망 분류기의 응용)

  • 최원호;최동혁;이병래;박규태
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.8
    • /
    • pp.1239-1249
    • /
    • 1990
  • In this paper, using neural network design techniques, an adaptive Mahalanobis distance classifier (AMDC) is designed. This classifier has three layers: an input layer, an internal layer, and an output layer. The connection from the input layer to the internal layer is fully connected, and that from the internal layer to the output layer is a partial connection that might be thought of as an ORing. If two or more clusters of patterns of one class lie apart in the feature space, the network adaptively generates internal nodes corresponding to the subclusters of that class. The number of output nodes is just the same as the number of classes to classify; the number of internal nodes, on the other hand, is defined by the number of subclusters and is optimized automatically. Using this method of forming subclasses, different patterns belonging to the same class can easily be distinguished from other classes. If additional training is needed after training has been completed, the AMDC does not have to repeat the training that has already been done. To test the performance of the AMDC, experiments on classifying 500 Hangeul characters were done. In the experiments, 20 print font sets of Hangeul characters (10,000 characters) were used for training, and with 3 sets (1,500 characters) the AMDC was tested for various initial variance and threshold values and compared with other statistical and neural classifiers.
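A hedged sketch of the core measure behind the AMDC: a Mahalanobis distance per (sub)cluster, with a class allowed to own several subclusters as in the internal layer described above. A diagonal covariance estimate is used here for simplicity, and the adaptive subcluster-generation procedure is omitted:

```python
# Classify by the minimum Mahalanobis distance over all subclusters of
# each class, so one class may occupy several separated regions.

def fit_cluster(points):
    n, d = len(points), len(points[0])
    mean = [sum(p[i] for p in points) / n for i in range(d)]
    var = [max(sum((p[i] - mean[i]) ** 2 for p in points) / n, 1e-9)
           for i in range(d)]  # diagonal covariance estimate
    return mean, var

def mahalanobis2(cluster, x):
    mean, var = cluster
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))

def classify(clusters, x):
    # clusters: {class_label: [cluster, ...]} - each class may have
    # several subclusters, like the AMDC's internal nodes.
    return min(clusters, key=lambda y: min(mahalanobis2(c, x) for c in clusters[y]))

clusters = {
    'ga': [fit_cluster([(0.0, 0.0), (0.2, 0.1), (0.1, 0.2)])],
    'na': [fit_cluster([(1.0, 1.0), (1.2, 0.9)]),       # two separated
           fit_cluster([(2.0, 0.0), (2.1, 0.2)])],      # subclusters
}
print(classify(clusters, (2.05, 0.1)))  # → na
```

The OR-style output connection in the AMDC corresponds to the inner `min` over a class's subclusters: hitting any one subcluster is enough to claim the class.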
