• Title/Summary/Keyword: classifiers


Topic Classification for Suicidology

  • Read, Jonathon;Velldal, Erik;Ovrelid, Lilja
    • Journal of Computing Science and Engineering / v.6 no.2 / pp.143-150 / 2012
  • Computational techniques for topic classification can support qualitative research by automatically applying labels in preparation for qualitative analyses. This paper presents an evaluation of supervised learning techniques applied to one such use case, namely, that of labeling emotions, instructions and information in suicide notes. We train a collection of one-versus-all binary support vector machine classifiers, using cost-sensitive learning to deal with class imbalance. The features investigated range from a simple bag-of-words and n-grams over stems, to information drawn from syntactic dependency analysis and WordNet synonym sets. The experimental results are complemented by an analysis of systematic errors in both the output of our system and the gold-standard annotations.
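
The one-versus-all setup with cost-sensitive learning described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the label names other than "instructions" and "information", the toy annotations, and the inverse-frequency ("balanced") cost heuristic are all assumptions, and the SVM training step itself is left out.

```python
from collections import Counter


def one_vs_all_targets(label_sets, topic):
    """Return +1/-1 targets for one binary 'topic vs. rest' problem."""
    return [1 if topic in labels else -1 for labels in label_sets]


def balanced_costs(targets):
    """Misclassification cost per class, inversely proportional to class frequency.

    This 'balanced' heuristic is an assumption, not the setting used in the paper.
    """
    counts = Counter(targets)
    n = len(targets)
    return {cls: n / (len(counts) * count) for cls, count in counts.items()}


# Hypothetical sentence-level annotations; a sentence may carry several labels.
label_sets = [
    {"information"}, {"instructions"}, {"love", "instructions"}, set(),
    {"information"}, set(), set(), {"hopelessness"},
]

# One binary problem per topic; the rarer class receives the larger cost.
for topic in ["information", "instructions", "hopelessness"]:
    targets = one_vs_all_targets(label_sets, topic)
    print(topic, balanced_costs(targets))
```

The resulting per-class costs would typically be passed to a binary classifier as class weights, so that errors on the rare positive class are penalized more heavily than errors on the frequent negative class.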

Improving the Error Back-Propagation Algorithm for Imbalanced Data Sets

  • Oh, Sang-Hoon
    • International Journal of Contents / v.8 no.2 / pp.7-12 / 2012
  • Imbalanced data sets are difficult to classify because most classifiers are developed on the assumption that class distributions are well balanced. To improve the error back-propagation algorithm for the classification of imbalanced data sets, a new error function is proposed. The error function controls weight updating according to the class to which each training sample belongs. As a result, samples in the minority class have a greater chance of being classified, while samples in the majority class have a smaller chance. The proposed method is compared with the two-phase, threshold-moving, and target node methods through simulations on a mammography data set, where it attains the best results.

Nonlinear Feature Transformation and Genetic Feature Selection: Improving System Security and Decreasing Computational Cost

  • Taghanaki, Saeid Asgari;Ansari, Mohammad Reza;Dehkordi, Behzad Zamani;Mousavi, Sayed Ali
    • ETRI Journal / v.34 no.6 / pp.847-857 / 2012
  • Intrusion detection systems (IDSs) have an important effect on system defense and security. Recently, most IDS methods have used transformed features, selected features, or original features. Both feature transformation and feature selection have their advantages. Neighborhood component analysis feature transformation and genetic feature selection (NCAGAFS) is proposed in this research. NCAGAFS is based on soft computing and data mining and uses the advantages of both transformation and selection. This method transforms features via neighborhood component analysis and chooses the best features with a classifier based on a genetic feature selection method. This novel approach is verified using the KDD Cup99 dataset, demonstrating higher performance than other well-known methods under various classifiers.

Missing Value Imputation based on Locally Linear Reconstruction for Improving Classification Performance (분류 성능 향상을 위한 지역적 선형 재구축 기반 결측치 대치)

  • Kang, Pilsung
    • Journal of Korean Institute of Industrial Engineers / v.38 no.4 / pp.276-284 / 2012
  • Classification algorithms generally assume that the data is complete. However, missing values are common in real data sets for various reasons. In this paper, we propose using locally linear reconstruction (LLR) for missing value imputation to improve classification performance when missing values exist. We first investigate how much missing values degrade classification performance at various missing ratios. Then, we compare the proposed LLR imputation with three well-known single imputation methods over three different classifiers using eight data sets. The experimental results showed that (1) all imputation methods, even the very simple ones, helped to improve classification accuracy; (2) among the imputation methods, the proposed LLR imputation was the most effective over all missing ratios; and (3) when the missing ratio was relatively high, LLR was outstanding, and its classification accuracy was as high as that obtained from the complete data set.

Identification of a Gaussian Fuzzy Classifier

  • Heesoo Hwang
    • International Journal of Control, Automation, and Systems / v.2 no.1 / pp.118-124 / 2004
  • This paper proposes an approach to deriving a fuzzy classifier based on evolutionary supervised clustering, which identifies the optimal clusters necessary to separate the classes. The clusters are formed by multi-dimensional weighted Euclidean distance, which allows clusters of varying shapes and sizes. A cluster induces a Gaussian fuzzy antecedent set with a unique variance in each dimension, which reflects the tightness of the cluster. The fuzzy classifier is composed of as many classification rules as classes. The clusters identified for each class constitute fuzzy sets, which are joined by an "and" connective in the antecedent part of the corresponding rule. The approach is evaluated using six data sets. Comparative results with different classifiers are given.
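
A rule with Gaussian antecedent sets and a separate variance per dimension, as described in the abstract above, might be evaluated as in the sketch below. The details are assumptions made for illustration: one cluster per class, the product as the "and" connective (the minimum is another common choice), and hand-picked centers and spreads in place of the evolutionary clustering step.

```python
import math


def gaussian_membership(x, center, sigma):
    """Degree to which x belongs to a Gaussian fuzzy set."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def rule_strength(sample, centers, sigmas):
    """Firing strength: per-dimension memberships joined by 'and' (product here)."""
    strength = 1.0
    for x, c, s in zip(sample, centers, sigmas):
        strength *= gaussian_membership(x, c, s)
    return strength


def classify(sample, rules):
    """Pick the class whose rule fires most strongly for this sample."""
    return max(rules, key=lambda label: rule_strength(sample, *rules[label]))


# Hypothetical rules: class label -> (cluster center, per-dimension sigma).
rules = {
    "A": ([0.0, 0.0], [1.0, 0.5]),
    "B": ([3.0, 3.0], [0.5, 2.0]),
}

print(classify([0.2, -0.1], rules))
```

A small sigma in one dimension makes the rule strict along that axis, which is how the per-dimension variance reflects the tightness of the cluster.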

A New Approach to the Design of Combining Classifier Based on Immune Algorithm

  • Kim, Moon-Hwan;Jeong, Keun-Ho;Joo, Young-Hoon;Park, Jin-Bae
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2003.10a / pp.1272-1277 / 2003
  • This paper presents a method for designing a combining classifier that is constructed from fuzzy and neural network classifiers and uses classifier fusion and selection algorithms. The input space of the combining classifier is divided by the extended hyperbox regions proposed in this paper to guarantee the non-overlapped data property. To fuse the fuzzy classifier and the neural network classifier, we propose a fusion parameter for the overlapped data. In addition, an adaptive learning algorithm is proposed to maximize classifier performance. Finally, simulation examples are given to illustrate the effectiveness of the method.


Classification of Emotional States of Interest and Neutral Using Features from Pulse Wave Signal

  • Phongsuphap, Sukanya;Sopharak, Akara
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2004.08a / pp.682-685 / 2004
  • This paper investigated a method for classifying emotional states using pulse wave signals. It focused on finding effective features for emotional state classification. The emotional states considered here were interest and neutral. Classification experiments used 65 samples of the interest state and 60 samples of the neutral state. We investigated 19 features derived from pulse wave signals using both time domain and frequency domain analysis methods, with two classifiers: minimum distance (normalized Euclidean distance) and $k$-Nearest Neighbour. Leave-one-out cross-validation was used as the evaluation method. Based on the experimental results, the most efficient features were a combination of 4 features consisting of (i) the mean of the first differences of the smoothed pulse rate time series signal, (ii) the mean of absolute values of the second differences of the normalized interbeat intervals, (iii) the root mean square successive difference, and (iv) the power in the high frequency range in normalized units, which provided 80.8% average accuracy with the $k$-Nearest Neighbour classifier.
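
Two ingredients named in the abstract above, the root mean square successive difference and leave-one-out evaluation of a nearest-neighbour classifier, can be sketched as follows. The feature vectors are invented toy values, and the plain Euclidean distance and majority vote are assumptions; the abstract does not state which distance or tie-breaking rule the nearest-neighbour classifier used.

```python
import math
from collections import Counter


def rmssd(intervals):
    """Root mean square of successive differences of interbeat intervals."""
    diffs = [b - a for a, b in zip(intervals, intervals[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def knn_predict(train, query, k):
    """Majority vote among the k training samples closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Hold out each sample once, train on the rest, and average the hits."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, features, k) == label
    return hits / len(data)


# Hypothetical two-feature samples for the two emotional states.
data = [
    ((0.10, 0.20), "interest"), ((0.20, 0.10), "interest"), ((0.15, 0.25), "interest"),
    ((0.90, 0.80), "neutral"), ((0.80, 0.90), "neutral"), ((0.85, 0.75), "neutral"),
]

print(rmssd([800, 810, 790, 805]))
print(leave_one_out_accuracy(data, k=1))
```

Leave-one-out evaluation suits small data sets such as the 125 samples reported here, because every sample is used for testing exactly once while the rest are used for training.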


Tomato Crop Disease Classification Using an Ensemble Approach Based on a Deep Neural Network (심층 신경망 기반의 앙상블 방식을 이용한 토마토 작물의 질병 식별)

  • Kim, Min-Ki
    • Journal of Korea Multimedia Society / v.23 no.10 / pp.1250-1257 / 2020
  • The early detection of diseases is important in agriculture because diseases are a major threat to crop yield. The shape and color of a plant leaf change in different ways depending on the disease, so a disease can be detected and estimated by inspecting the visual features of the leaf. This study presents a vision-based leaf classification method for detecting diseases of tomato crops. The ResNet-50 model was used to extract the visual features of the leaf and classify the disease, because it showed higher accuracy than ResNet models of other depths. We propose a new ensemble approach using several DCNN classifiers that have the same structure but have been trained over different ranges of the DCNN layers. In experiments, the method achieved an accuracy of 97.19% on the PlantVillage dataset, which indicates that the proposed method effectively classifies diseases of tomato crops.

Comparison of Fuzzy Classifiers Based on Fuzzy Membership Functions : Applies to Satellite Landsat TM Image

  • Kim Jin Il;Jeon Young Joan;Choi Young Min
    • Proceedings of the IEEK Conference / 2004.08c / pp.842-845 / 2004
  • The aim of this study is to compare classification results according to the choice of fuzzy membership function within fuzzy rules. There are various methods of extracting rules from training data in the process of fuzzy rule generation. Pattern distribution characteristics are considered when producing fuzzy rules. The accuracy of the classification results depends not only on the characteristics of the fuzzy subspaces but also on the choice of fuzzy membership functions. This paper shows how to produce various types of fuzzy rules by partitioning the pattern space and presents the results of land cover classification in satellite remote sensing images obtained with various fuzzy membership functions. The experiments are carried out on a Landsat TM image, and the classification results are compared across fuzzy membership functions.


An Efficient Character Recognition Algorithm in Printed Korean/English Documents Including Touching Characters (붙은 글자들이 포함된 인쇄체 한.영 혼용 문서에서의 효과적인 문자 인식 알고리즘)

  • 김규경;김진호;진성일;최흥문
    • Journal of the Korean Institute of Telematics and Electronics B / v.33B no.11 / pp.116-126 / 1996
  • In this paper, we present a character recognition algorithm for printed Korean and English documents that include touching characters. We derived two rules to segment and recognize touching characters in bilingual documents: one from the shape characteristics of Korean and English characters in the writing blocks defined in this paper, and the other from the RF (reliability factor) values generated by the classifiers. The overall classification accuracy of the proposed algorithm on the KITE paper was about 96.8% for the English abstract and about 97.8% for the bilingual parts. We also confirmed that the proposed algorithm significantly improves the accuracy of character segmentation in actual mixed Korean and English documents that include touching characters.
