• Title/Summary/Keyword: classifiers

Wavelet based Fuzzy Integral System for 3D Face Recognition (퍼지적분을 이용한 웨이블릿 기반의 3차원 얼굴 인식)

  • Lee, Yeung-Hak;Shim, Jae-Chang
    • Journal of KIISE:Software and Applications / v.35 no.10 / pp.616-626 / 2008
  • The face shape extracted from depth values is among the most important facial features, and face images decomposed into frequency subbands capture personal features in detail. In this paper, we develop a method for recognizing range face images that combines multiple frequency domains for each depth image and fuses the depth regions using a fuzzy integral. In the first step, the nose tip, which protrudes from the face, is located in the extracted face area. It serves as the reference point for normalizing the facial pose and for extracting multiple regions by depth threshold values. In the second step, wavelet coefficients extracted from selected wavelet subbands are adopted as features for the authentication problem. In the third step, eigenface and Linear Discriminant Analysis (LDA) methods are applied to reduce the dimension and classify. In the last step, the individual classifiers are aggregated with the fuzzy integral over the coefficients extracted at each resolution level. In the experiments, the region with depth threshold value 60 (DT60) showed the highest recognition rate among the regions, and the depth fusion method with the fuzzy integral achieved a 98.6% recognition rate.
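
The aggregation step can be illustrated with a minimal sketch of one common form of fuzzy integral, the Sugeno integral over a λ-fuzzy measure. The abstract does not say which variant the authors use, so this is an assumption, and the `densities` (one importance value in (0, 1) per classifier, at least two classifiers) and the support scores are hypothetical.

```python
def solve_lambda(densities, iters=200):
    """Find lambda > -1 with prod(1 + lambda*g_i) = 1 + lambda by bisection.

    Assumes at least two densities, each strictly between 0 and 1.
    """
    if len(densities) < 2:
        raise ValueError("need at least two classifiers")
    total = sum(densities)
    if abs(total - 1.0) < 1e-9:
        return 0.0  # the measure is already additive

    def f(lam):
        prod = 1.0
        for g in densities:
            prod *= 1.0 + lam * g
        return prod - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0, -1e-9          # root lies in (-1, 0)
    else:
        lo, hi = 1e-9, 1.0            # root lies in (0, inf)
        while f(hi) < 0.0:
            hi *= 2.0
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sugeno_integral(scores, densities):
    """Fuse per-classifier support values for one class."""
    lam = solve_lambda(densities)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    measure, best = 0.0, 0.0
    for i in order:
        g = densities[i]
        # Measure of the set of the classifiers seen so far.
        measure = g + measure + lam * g * measure
        best = max(best, min(scores[i], measure))
    return best


def fuse(class_scores, densities):
    """Return the class label with the largest fused support."""
    return max(class_scores,
               key=lambda c: sugeno_integral(class_scores[c], densities))
```

For example, `fuse({"A": [0.9, 0.8, 0.2], "B": [0.1, 0.2, 0.8]}, [0.4, 0.4, 0.1])` favours class `"A"`, because the two classifiers with the larger densities both support it.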

Improving Naïve Bayes Text Classifiers with Incremental Feature Weighting (점진적 특징 가중치 기법을 이용한 나이브 베이즈 문서분류기의 성능 개선)

  • Kim, Han-Joon;Chang, Jae-Young
    • The KIPS Transactions:PartB / v.15B no.5 / pp.457-464 / 2008
  • In real-world operational environments, most text classification systems suffer from insufficient training documents and no prior knowledge of the feature space. In this regard, Naïve Bayes is known to be an appropriate algorithm for operational text classification, since the classification model can evolve easily by incrementally updating its pre-learned model and feature space. This paper proposes a technique for improving the Naïve Bayes classifier through a feature weighting strategy. The basic idea is that parameter estimation in Naïve Bayes considers the degree of feature importance as well as the feature distribution. We can develop a more accurate classification model by incorporating feature weights into the Naïve Bayes learning algorithm, rather than performing the learning process with a reduced feature set. In addition, we have extended a conventional feature update algorithm to support incremental feature weighting in a dynamic operational environment. To evaluate the proposed method, we perform experiments on various document collections and show that the traditional Naïve Bayes classifier can be significantly improved by the proposed technique.
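
A minimal sketch of the general idea, assuming a multinomial Naïve Bayes with Laplace smoothing in which each term's log-likelihood is scaled by a feature weight. The class name `WeightedNB`, the `weights` dictionary, and the scaling scheme are illustrative assumptions; the abstract does not give the authors' weighting formula.

```python
import math
from collections import Counter, defaultdict


class WeightedNB:
    """Incrementally updatable Naive Bayes with per-feature weights."""

    def __init__(self):
        self.doc_counts = Counter()               # documents seen per class
        self.term_counts = defaultdict(Counter)   # term frequencies per class
        self.vocab = set()                        # feature space, grows over time

    def update(self, tokens, label):
        """Fold one labelled document into the model without retraining."""
        self.doc_counts[label] += 1
        self.term_counts[label].update(tokens)
        self.vocab.update(tokens)

    def predict(self, tokens, weights=None):
        """Pick the class maximising log P(c) + sum_t w_t * log P(t | c)."""
        weights = weights or {}
        n_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.doc_counts.items():
            total = sum(self.term_counts[label].values())
            score = math.log(n / n_docs)
            for t in tokens:
                # Laplace-smoothed term probability.
                p = (self.term_counts[label][t] + 1) / (total + len(self.vocab))
                # Unweighted terms default to weight 1.0 (plain Naive Bayes).
                score += weights.get(t, 1.0) * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Because `update` only adjusts counts, new documents and previously unseen features can be added at any time; the weights can likewise be recomputed from the running counts.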

Application of Random Over Sampling Examples(ROSE) for an Effective Bankruptcy Prediction Model (효과적인 기업부도 예측모형을 위한 ROSE 표본추출기법의 적용)

  • Ahn, Cheolhwi;Ahn, Hyunchul
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.525-535 / 2018
  • If one class occurs far more frequently than the others in a classification problem, data imbalance arises and distorts machine learning. Corporate bankruptcy prediction often suffers from data imbalance, since the ratio of insolvent companies is generally very low whereas the ratio of solvent companies is very high. To mitigate this problem, a proper sampling technique must be applied. Until now, oversampling techniques, which adjust the class distribution of a data set by sampling the minority class with replacement, have been popular. However, they carry a risk of overfitting. Against this background, this study applies the ROSE (Random Over Sampling Examples) technique, proposed by Menardi and Torelli in 2014, to effective corporate bankruptcy prediction. The ROSE technique creates new training samples by synthesizing them from the existing ones, so it leads to better prediction accuracy of the classifiers while avoiding the risk of overfitting. Specifically, our study proposes combining the ROSE method with the SVM (support vector machine), which is widely regarded as one of the best binary classifiers. We applied the proposed method to a real-world bankruptcy prediction case from a major Korean bank and compared its performance with that of other sampling techniques. Experimental results showed that ROSE improved the prediction accuracy of SVM in bankruptcy prediction compared to the other techniques, with statistical significance. These results suggest that ROSE can be a good alternative for resolving data imbalance in prediction problems in the social sciences beyond bankruptcy prediction.
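
The synthesis idea can be sketched as a smoothed bootstrap: pick an existing minority example at random and draw a new point from a Gaussian kernel centred on it. This is a simplified illustration, not the study's implementation; the function name `rose_oversample` is hypothetical, only the minority class is resampled, and the per-feature bandwidth uses a normal-reference rule as an assumption.

```python
import random
import statistics


def rose_oversample(minority, n_new, rng=None):
    """Generate n_new synthetic minority rows by a smoothed bootstrap.

    minority: list of equal-length numeric feature lists (at least two rows).
    """
    rng = rng or random.Random(0)
    n, d = len(minority), len(minority[0])
    # Assumed bandwidth rule: h_q = (4 / ((d + 2) * n)) ** (1 / (d + 4)) * sigma_q
    factor = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    bandwidths = [
        factor * statistics.stdev([row[q] for row in minority])
        for q in range(d)
    ]
    synthetic = []
    for _ in range(n_new):
        seed = rng.choice(minority)  # plain bootstrap draw
        # Jitter each feature around the seed instead of copying it verbatim.
        synthetic.append([rng.gauss(x, h) for x, h in zip(seed, bandwidths)])
    return synthetic
```

Because each synthetic row is a perturbed copy rather than an exact duplicate, a classifier trained on the result sees a neighbourhood around the minority examples rather than repeated points.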

3D Face Recognition using Wavelet Transform Based on Fuzzy Clustering Algorithm (퍼지 군집화 알고리즘 기반의 웨이블릿 변환을 이용한 3차원 얼굴 인식)

  • Lee, Yeung-Hak
    • Journal of Korea Multimedia Society / v.11 no.11 / pp.1501-1514 / 2008
  • The face shape extracted from depth values is among the most important sources of facial information, and face images decomposed into frequency subbands capture personal features in detail. In this paper, we develop a method for recognizing range face images from multiple frequency domains for each depth image, using a modified fuzzy c-means algorithm. In the first step, the nose tip, which protrudes from the face, is located in the extracted face area. In the second step, the face is normalized to a frontal pose. Multiple contour-line regions, which have a different shape for each person, are extracted by depth threshold values measured from the reference point, the nose tip. The frequency components extracted from the wavelet subbands are then adopted as features for the authentication problem. In the third step, eigenface is applied to reduce the dimension, and linear discriminant analysis (LDA) is adopted to improve discrimination between similar features. In the last step, the individual classifiers, which use the modified fuzzy c-means method with K-NN to initialize the membership degrees, are applied to the coefficients extracted at each resolution level. In the experiments, the region with depth threshold value 60 (DT60) showed the highest recognition rate among the extracted regions, and the proposed classification method with fuzzy clustering achieved a 98.3% recognition rate.
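
For orientation, the membership update of the standard fuzzy c-means algorithm can be sketched as follows. The paper uses a modified variant with K-NN initialization whose details are not given in the abstract, so this shows only the textbook rule; the function name and the fuzzifier default `m=2.0` are assumptions.

```python
import math


def fcm_memberships(x, centers, m=2.0):
    """Membership of point x in each cluster under standard fuzzy c-means.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where d_i is the distance
    from x to center i and m > 1 is the fuzzifier.
    """
    dists = [math.dist(x, c) for c in centers]
    if any(d == 0.0 for d in dists):
        # x coincides with a center: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]
```

The memberships always sum to 1, and a point that is three times closer to one center than to another receives memberships of 0.9 and 0.1 when `m = 2`.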

Scalable and Accurate Intrusion Detection using n-Gram Augmented Naive Bayes and Generalized k-Truncated Suffix Tree (N-그램 증강 나이브 베이스 알고리즘과 일반화된 k-절단 서픽스트리를 이용한 확장가능하고 정확한 침입 탐지 기법)

  • Kang, Dae-Ki;Hwang, Gi-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.4 / pp.805-812 / 2009
  • The n-gram approach has been widely applied in intrusion detection. However, it has shown several problems, including poor scalability and double counting of features. To address these problems, we applied n-gram augmented Naive Bayes with a k-truncated suffix tree (k-TST) storage mechanism directly to classify intrusive sequences, and compared its performance with that of Naive Bayes and Support Vector Machines (SVM) with n-gram features in experiments on host-based intrusion detection benchmark data sets. Experimental results on the University of New Mexico (UNM) benchmark data sets show that the n-gram augmented method, which solves the independence violation that occurs when n-gram features are applied directly to Naive Bayes (i.e., Naive Bayes with n-gram features), yields intrusion detectors with higher accuracy than Naive Bayes with n-gram features and accuracy comparable to SVM with n-gram features. For scalable and efficient counting, we store the n-gram features in a k-truncated suffix tree. With this storage mechanism, we tested the classifiers up to 20-grams, which illustrates the scalability and accuracy of n-gram augmented Naive Bayes with k-truncated suffix tree storage.
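
The storage idea can be sketched with an uncompressed trie of depth k: every suffix of a sequence is inserted truncated to k symbols, so the count stored at depth n is the frequency of that n-gram. This is a simplified stand-in for a k-truncated suffix tree (no edge compression, one sequence at a time), and the names `build_ktst` and `ngram_count` are hypothetical.

```python
class Node:
    """Trie node holding how often the path from the root occurs."""
    __slots__ = ("count", "children")

    def __init__(self):
        self.count = 0
        self.children = {}


def build_ktst(sequence, k, root=None):
    """Insert every suffix of `sequence`, truncated to k symbols, into a trie."""
    root = root or Node()
    for start in range(len(sequence)):
        node = root
        for symbol in sequence[start:start + k]:
            node = node.children.setdefault(symbol, Node())
            node.count += 1
    return root


def ngram_count(root, ngram):
    """Frequency of `ngram` (length <= k) among the inserted sequences."""
    node = root
    for symbol in ngram:
        node = node.children.get(symbol)
        if node is None:
            return 0
    return node.count
```

Passing an existing `root` back into `build_ktst` adds further sequences to the same trie, which is how counts over a whole data set of system-call traces could be accumulated; the counts for all n from 1 to k are available from the one structure.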

Detection of Clavibacter michiganensis subsp. michiganensis Assisted by Micro-Raman Spectroscopy under Laboratory Conditions

  • Perez, Moises Roberto Vallejo;Contreras, Hugo Ricardo Navarro;Herrera, Jesus A. Sosa;Avila, Jose Pablo Lara;Tobias, Hugo Magdaleno Ramirez;Martinez, Fernando Diaz-Barriga;Ramirez, Rogelio Flores;Vazquez, Angel Gabriel Rodriguez
    • The Plant Pathology Journal / v.34 no.5 / pp.381-392 / 2018
  • Clavibacter michiganensis subsp. michiganensis (Cmm) is a quarantine-worthy pest in México. The implementation and validation of new technologies is necessary to reduce the time needed for bacterial detection under laboratory conditions, and Raman spectroscopy is a promising technology that has all of the features needed to characterize and identify bacteria. Under controlled conditions, a contagion process was induced with Cmm and the disease epidemiology was monitored. The performance of the micro-Raman spectroscopy technique (532 nm laser) in assisting Cmm detection through its characteristic Raman spectral fingerprint was evaluated. Our experiment was conducted with tomato plants in a completely randomized block experimental design (13 plants × 4 rows). The Cmm infection was confirmed by 16S rDNA, and plants showed symptoms 48 to 72 h after inoculation. The incidence and severity in the plant population varied over time and kept an aggregated spatial pattern. The contagion process reached 79% just 24 days after the epidemic was induced. Micro-Raman spectroscopy proved its speed, efficiency and usefulness as a non-destructive method for the preliminary detection of Cmm. Carotenoid-specific bands at 1146 and 1510 cm⁻¹ were the distinguishing markers. In the chemometric analyses, the best performance was obtained with PCA-LDA supervised classification algorithms applied to the Raman spectrum data, which reached 100% on the classifier metrics (sensitivity, specificity, accuracy, negative and positive predictive value) and allowed us to differentiate Cmm from other endophytic bacteria (Bacillus and Pantoea). The unsupervised KMeans algorithm also showed good performance (100, 96, 98, 91 and 100%, respectively).
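
The five classifier metrics listed in the abstract all derive from a binary confusion matrix. A minimal sketch, with a hypothetical function name and the assumption that every denominator is non-zero:

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute the five metrics from confusion-matrix counts.

    tp/fn: positive samples classified correctly/incorrectly.
    tn/fp: negative samples classified correctly/incorrectly.
    """
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "npv": tn / (tn + fn),                    # negative predictive value
        "ppv": tp / (tp + fp),                    # positive predictive value
    }
```

With made-up counts `tp=40, fp=5, tn=45, fn=10`, sensitivity is 0.8, specificity 0.9 and accuracy 0.85; these numbers are for illustration only and are not taken from the study.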

Classification of Axis-symmetric Flaws with Non-Symmetric Cross-Sections using Simulated Eddy Current Testing Signals (모사 와전류 탐상신호를 이용한 비대칭 단면을 갖는 축대칭 결함의 형상분류)

  • Song, S.J.;Kim, C.H.;Shin, Y.K.;Lee, H.B.;Park, Y.W.;Yim, C.J.
    • Journal of the Korean Society for Nondestructive Testing / v.21 no.5 / pp.510-517 / 2001
  • This paper describes an initial study on applying eddy current pattern recognition approaches to more realistic flaw characterization in steam generator tubes. For this purpose, theoretical eddy current testing (ECT) signals are simulated with a finite-element model for 5 types of outer-diameter (OD) flaws, with variation in flaw size parameters and testing frequency. In addition, three software tools are developed to support the steps of the pattern recognition approach: feature extraction, feature selection, and classification by probabilistic neural networks (PNNs). The cross point of the ECT signals simulated from flaws with non-symmetric cross-sections deviates from the origin of the impedance plane. New features that take advantage of this phenomenon are added, completing a feature set with a total of 18 features. Classification with PNNs is then performed on this feature set. The PNN classifiers show high performance in identifying the symmetry of a flaw's cross-section. However, they show very limited success in determining the sharpness of flaw tips.
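
A probabilistic neural network is essentially a Parzen-window classifier: each class score is the average of Gaussian kernels centred on that class's training patterns, and the largest score wins. The sketch below shows this basic form only; the function name, the smoothing parameter `sigma`, and the two-feature example are assumptions and do not reflect the paper's 18-feature set.

```python
import math


def pnn_classify(x, training, sigma=1.0):
    """Classify feature vector x with a basic probabilistic neural network.

    training: dict mapping class label -> list of training feature vectors.
    """
    scores = {}
    for label, patterns in training.items():
        # Pattern layer: one Gaussian kernel per stored training vector.
        total = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
            for p in patterns
        )
        # Summation layer: average kernel response for the class.
        scores[label] = total / len(patterns)
    # Decision layer: the class with the largest estimated density.
    return max(scores, key=scores.get)
```

There is no iterative training: adding a new simulated signal to a class only appends one more vector to `training`.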

Analysis of large-scale flood inundation area using optimal topographic factors (지형학적 인자를 이용한 광역 홍수범람 위험지역 분석)

  • Lee, Kyoungsang;Lee, Daeeop;Jung, Sungho;Lee, Giha
    • Journal of Korea Water Resources Association / v.51 no.6 / pp.481-490 / 2018
  • Recently, the spatiotemporal patterns of flood disasters have become more complex and unpredictable due to climate change. Flood hazard maps, which include information on flood risk levels, have been widely used as a non-structural measure against flood damage. Producing a high-precision flood hazard map by combining hydrologic and hydraulic modeling requires a large amount of digital information, such as topography, geology, climate, land use and various socio-economic databases. However, in some areas, especially in developing countries, flood hazard mapping is difficult or impossible and its accuracy is insufficient because such data are lacking or inaccessible. Therefore, this study suggests a method for delineating large-scale flood-prone areas based on topographic factors, selected with a linear binary classifier and ROC (Receiver Operating Characteristic) analysis, using globally available geographic data such as ASTER or SRTM. We applied the proposed methodology to five countries: North Korea, Bangladesh, Indonesia, Thailand and Myanmar. The results show that model performance in flood area detection ranges from 38% (Bangladesh) to 78% (Thailand). Flood-prone area detection based on topographic factors has the great advantage that large-scale inundation-prone areas can be easily distinguished for ungauged watersheds using only a digital elevation model (DEM).
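
A linear binary classifier on a single topographic factor reduces to a threshold, and ROC analysis can be used to choose it. The sketch below sweeps candidate thresholds and keeps the one that maximises the true positive rate minus the false positive rate (Youden's J). The selection criterion, the convention that low values mean flood-prone, and the function name are assumptions; the abstract does not specify them.

```python
def best_threshold(values, labels):
    """Pick the threshold on a topographic factor that best separates cells.

    values: factor value per grid cell (lower is assumed more flood-prone).
    labels: 1 if the cell was observed flooded, else 0 (both classes present).
    Returns (threshold, J) where J = TPR - FPR at that threshold.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(values)):
        # Cells at or below the threshold are predicted flood-prone.
        tp = sum(1 for v, y in zip(values, labels) if v <= t and y)
        fp = sum(1 for v, y in zip(values, labels) if v <= t and not y)
        j = tp / pos - fp / neg
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j
```

Repeating this for each candidate factor and comparing the resulting J values (or the areas under the ROC curves) gives one way to rank the factors.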

A Study on Classification of CNN-based Linux Malware using Image Processing Techniques (영상처리기법을 이용한 CNN 기반 리눅스 악성코드 분류 연구)

  • Kim, Se-Jin;Kim, Do-Yeon;Lee, Hoo-Ki;Lee, Tae-Jin
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.9 / pp.634-642 / 2020
  • With the proliferation of Internet of Things (IoT) devices, the use of the Linux operating system across various architectures has increased. Security threats against Linux-based IoT devices are also increasing, and malware variants based on existing malware are constantly appearing. In this paper, we propose a system in which the binary data of an Executable and Linkable Format (ELF) file is visualized as an image, processed with the Local Binary Pattern (LBP) technique and a median filter, and classified as malware by a Convolutional Neural Network (CNN). In the results, the original image showed the highest accuracy and F1-score at 98.77%, and also the highest recall at 98.55%. The median filter gave the highest precision, 99.19%, and the lowest false positive rate, 0.008%. The LBP technique gave lower overall results than the median filter. When the classifications of the original image and the image-processed versions were combined by majority vote, the accuracy, precision, F1-score, and false positive rate were better than those obtained with the median filter alone. In the future, the proposed system will be used to classify malware families, and other image processing techniques will be added to improve the accuracy of majority-vote classification.
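
The preprocessing and voting steps can be sketched as follows, assuming each byte of the file becomes one grayscale pixel in rows of a fixed width, a 3×3 neighbourhood for both LBP and the median filter, and a simple majority over the per-image CNN predictions. The function names, the image width, and the border handling (border pixels are dropped) are assumptions; the CNN itself is omitted.

```python
from collections import Counter

# Clockwise 3x3 neighbourhood starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def bytes_to_image(data, width):
    """Lay out raw file bytes as rows of `width` grayscale pixels (0-255)."""
    rows = len(data) // width  # a trailing partial row is discarded
    return [list(data[r * width:(r + 1) * width]) for r in range(rows)]


def lbp(image):
    """Local Binary Pattern code for every interior pixel."""
    out = []
    for r in range(1, len(image) - 1):
        row = []
        for c in range(1, len(image[0]) - 1):
            center, code = image[r][c], 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row.append(code)
        out.append(row)
    return out


def median_filter(image):
    """3x3 median of every interior pixel."""
    out = []
    for r in range(1, len(image) - 1):
        row = []
        for c in range(1, len(image[0]) - 1):
            window = sorted(image[r + dr][c + dc]
                            for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            row.append(window[4])
        out.append(row)
    return out


def majority_vote(predictions):
    """Most common label among the per-image classifier predictions."""
    return Counter(predictions).most_common(1)[0][0]
```

In a pipeline of this shape, the original image, its LBP image and its median-filtered image would each be classified separately, and `majority_vote` would combine the three labels.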

Object/Non-object Image Classification Based on the Detection of Objects of Interest (관심 객체 검출에 기반한 객체 및 비객체 영상 분류 기법)

  • Kim Sung-Young
    • Journal of the Korea Society of Computer and Information / v.11 no.2 s.40 / pp.25-33 / 2006
  • We propose a method that automatically classifies images into object and non-object images. An object image is an image that contains one or more objects. An object in an image is defined as a set of regions that lie around the center of the image and have a color distribution that differs significantly from the surrounding (background) regions. We define four measures based on the characteristics of an object to classify the images. The first measure, center significance, is calculated from the difference in color distribution between the center area and its surrounding region. The second measure is the variance of significantly correlated colors in the image plane. Significantly correlated colors are defined as the colors of two adjacent pixels that appear more frequently around the center of an image than in its background. The third measure is the edge strength at the boundary of the object candidate. However, the third measure is computationally expensive because the central object must be extracted. We therefore define a fourth measure with characteristics similar to the third; it can be calculated faster but is less accurate. To classify the images, we combine the measures by training a neural network and an SVM, and we compare the classification accuracies of these two classifiers.
