• Title/Summary/Keyword: PCA (Principal Component Analysis)


The Design of GA-based TSK Fuzzy Classifier and Its application (GA기반 TSK 퍼지 분류기의 설계 및 응용)

  • 곽근창;김승석;유정웅;전명근
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2001.12a / pp.233-236 / 2001
  • In this paper, we propose a TSK-type fuzzy classifier using PCA (Principal Component Analysis), FCM (Fuzzy C-Means) clustering, and a hybrid GA (Genetic Algorithm). First, the input data are transformed by PCA to reduce the correlation among the data components. FCM clustering is then applied to obtain an initial TSK-type fuzzy classifier. Parameter identification is performed by an AGA (Adaptive Genetic Algorithm) and RLSE (Recursive Least Squares Estimate). We applied the proposed method to the Iris data classification problem and obtained better performance than previous works.
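
The decorrelation step described in this abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `pca_decorrelate` and the synthetic data are assumptions made for illustration, and the FCM and GA stages are omitted.

```python
import numpy as np

def pca_decorrelate(X):
    """Project the rows of X onto the principal axes of its covariance matrix."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder by descending variance
    return Xc @ eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(0)
# Synthetic correlated data (hypothetical stand-in for four measured features).
mixing = rng.normal(size=(4, 4))
X = rng.normal(size=(150, 4)) @ mixing

Z, variances = pca_decorrelate(X)
cov_Z = np.cov(Z, rowvar=False)              # close to diagonal after the transform
```

After the transform, the off-diagonal entries of `cov_Z` are numerically zero, which is the property a later clustering or rule-building stage would rely on.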


Speaker Identification Using Greedy Kernel PCA (Greedy Kernel PCA를 이용한 화자식별)

  • Kim, Min-Seok;Yang, Il-Ho;Yu, Ha-Jin
    • MALSORI / no.66 / pp.105-116 / 2008
  • In this research, we propose a speaker identification system using a kernel method, which is expected to model the non-linearity of speech features well. We have used principal component analysis (PCA) successfully and have extended it to kernel PCA, which is used for many pattern recognition tasks such as face recognition. However, kernel PCA cannot be used for speaker identification directly, because the storage required for the kernel matrix grows quadratically and the computational cost (computing the eigenvectors of an $l \times l$ matrix) grows linearly with the number of training vectors $l$. Therefore, we use greedy kernel PCA, which can approximate kernel PCA with a small representation error. In the experiments, we compare the accuracy of greedy kernel PCA with that of baseline Gaussian mixture models using MFCCs and PCA. The results with limited enrollment data show that greedy kernel PCA outperforms the conventional methods.
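
To make the storage issue concrete, here is a minimal sketch of plain (non-greedy) kernel PCA with an RBF kernel. The kernel choice, the `gamma` value, the function name `rbf_kernel_pca`, and the random feature vectors are all assumptions for illustration; the greedy approximation proposed in the paper is not reproduced here.

```python
import numpy as np

def rbf_kernel_pca(X, n_components, gamma):
    """Plain kernel PCA; returns the l x l kernel matrix and the projections."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))   # l x l kernel matrix
    l = K.shape[0]
    one = np.full((l, l), 1.0 / l)
    Kc = K - one @ K - K @ one + one @ K @ one       # center in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)            # ascending eigenvalues
    idx = np.argsort(eigvals)[::-1][:n_components]   # keep the largest ones
    lam, alpha = eigvals[idx], eigvecs[:, idx]
    # Projection of the training points onto unit-norm feature-space axes.
    return K, alpha * np.sqrt(np.maximum(lam, 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))   # hypothetical stand-in for l = 200 feature vectors
K, Y = rbf_kernel_pca(X, n_components=3, gamma=0.05)
```

The matrix `K` holds `l * l` entries, so doubling the number of training vectors quadruples the memory needed, which is the limitation that motivates an approximation.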


Input Variables Selection by Principal Component Analysis and Mutual Information Estimation (주요성분분석과 상호정보 추정에 의한 입력변수선택)

  • Cho, Yong-Hyun;Hong, Seong-Jun
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.2 / pp.220-225 / 2007
  • This paper presents an efficient input variable selection method using both principal component analysis (PCA) and adaptive partition mutual information (AP-MI) estimation. PCA, which is based on second-order statistics, is applied to prevent overestimation by quickly removing the dependence between input variables. AP-MI estimation is then applied to estimate the dependence accurately by equally partitioning the samples of each input variable when calculating the probability density function. The proposed method has been applied to two input variable selection problems: 7 artificial signals of 500 samples and 24 environmental pollution signals of 55 samples. The experimental results show that the proposed method selects variables quickly and accurately. It also performs better than both AP-MI estimation without PCA and regular partition MI estimation.
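
As a rough illustration of estimating mutual information from partitioned samples, the sketch below uses fixed equal-frequency (rank-based) bins. This is a simplification and not the AP-MI estimator of the paper; the function name `equal_frequency_mi`, the bin count, and the test signals are assumptions.

```python
import numpy as np

def equal_frequency_mi(x, y, n_bins=8):
    """Estimate I(X;Y) in nats using bins that hold roughly equal sample counts."""
    def to_bins(v):
        ranks = np.argsort(np.argsort(v))        # rank of each sample, 0..n-1
        return (ranks * n_bins) // len(v)        # bin index, 0..n_bins-1
    bx, by = to_bins(x), to_bins(y)
    joint = np.zeros((n_bins, n_bins))
    np.add.at(joint, (bx, by), 1.0)              # 2-D histogram of bin pairs
    joint /= joint.sum()                         # joint probability estimate
    px = joint.sum(axis=1, keepdims=True)        # marginal of x, shape (n_bins, 1)
    py = joint.sum(axis=0, keepdims=True)        # marginal of y, shape (1, n_bins)
    mask = joint > 0                             # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log(joint[mask] / (px @ py)[mask])))

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y_dep = x + 0.1 * rng.normal(size=500)           # strongly dependent on x
y_ind = rng.normal(size=500)                     # independent of x

mi_dep = equal_frequency_mi(x, y_dep)
mi_ind = equal_frequency_mi(x, y_ind)
```

A variable selection procedure would rank candidate inputs by such a score; here `mi_dep` comes out much larger than `mi_ind`, although the independent case is not exactly zero because of finite-sample bias.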

Detecting outliers in multivariate data and visualization-R scripts (다변량 자료에서 특이점 검출 및 시각화 - R 스크립트)

  • Kim, Sung-Soo
    • The Korean Journal of Applied Statistics / v.31 no.4 / pp.517-528 / 2018
  • We provide R scripts to detect and visualize outliers in multivariate data. Outliers are detected using three approaches: 1) robust Mahalanobis distance, 2) methods for high-dimensional data, and 3) density-based methods. We use the following techniques to visualize the detected potential outliers: 1) multidimensional scaling (MDS) and minimal spanning tree (MST) with k-means clustering, 2) MDS with fviz_cluster, and 3) principal component analysis (PCA) with fviz_cluster. As real data sets, we use MLB pitching data from 2013 and 2014, which include Ryu, Hyun-jin. The developed R scripts can be downloaded at "http://www.knou.ac.kr/~sskim/ddpoutlier.html" (an R package can also be downloaded there).

Automatic e-mail Hierarchy Classification using Dynamic Category Hierarchy and Principal Component Analysis (PCA와 동적 분류체계를 사용한 자동 이메일 계층 분류)

  • Park, Sun
    • Journal of Advanced Navigation Technology / v.13 no.3 / pp.419-425 / 2009
  • The volume of incoming e-mail is increasing rapidly due to the widespread use of the Internet, so incoming e-mails need to be classified efficiently and accurately. Current e-mail classification techniques focus on two-way classification, which filters spam from normal mail and is based mainly on Bayesian and rule-based methods. Clustering methods have been used for multi-way classification of e-mails, but they suffer from low classification accuracy and provide no category labels. Classification methods, in turn, require the user to train the classifier and set the category labels. In this paper, we propose a novel multi-way e-mail hierarchy classification method that uses PCA for automatic category generation and a dynamic category hierarchy for high classification accuracy. It classifies a huge amount of incoming e-mails automatically, efficiently, and accurately.


Measurement of the Visibility of the Smoke Images using PCA (PCA를 이용한 연기 영상의 가시도 측정)

  • Yu, Young-Jung;Moon, Sang-ho;Park, Seong-Ho
    • Journal of the Korea Institute of Information and Communication Engineering / v.22 no.11 / pp.1474-1480 / 2018
  • When a fire occurs in a high-rise building, it is difficult to determine whether each escape route is safe because of the building's complex structure. It is therefore necessary to determine the safety of the escape routes and provide them to residents quickly. We propose a method that measures the visibility of an escape route affected by smoke from the fire by analyzing images. Visibility can be measured easily if the density of the smoke detected in the input image is known; however, this approach is difficult to use because there are no suitable methods for measuring smoke density. In this paper, we use principal component analysis, with background images extracted from the input images serving as training data. Background images and smoke images are extracted from the input images, the learned principal component analysis is applied to map them into a new feature space, and the visibility due to the smoke is measured from the resulting change.

Light-weight Classification Model for Android Malware through the Dimensional Reduction of API Call Sequence using PCA

  • Jeon, Dong-Ha;Lee, Soo-Jin
    • Journal of the Korea Society of Computer and Information / v.27 no.11 / pp.123-130 / 2022
  • Recently, studies on the detection and classification of Android malware based on API call sequences have been actively carried out. However, API call sequence based malware classification has serious limitations, such as excessive time and resource consumption for malware analysis and learning model construction, due to the vast amount of data and the high dimensionality of the features. In this study, we analyzed various classification models such as LightGBM, Random Forest, and k-Nearest Neighbors after significantly reducing the dimension of the features using PCA (Principal Component Analysis) on the CICAndMal2020 dataset, which contains a vast amount of API call information. The experimental results show that PCA significantly reduces the dimension of the features while maintaining the characteristics of the original data and achieves efficient malware classification performance. Both binary and multi-class classification achieve higher accuracy than previous studies, even when the data features are reduced to less than 1% of the total size.

Efficient Primary-Ambient Decomposition Algorithm for Audio Upmix (오디오 업믹스를 위한 효율적인 주성분-주변성분 분리 알고리즘)

  • Baek, Yong-Hyun;Jeon, Se-Woon;Lee, Seok-Pil;Park, Young-Cheol
    • Journal of Broadcast Engineering / v.17 no.6 / pp.924-932 / 2012
  • Decomposition of a stereo signal into primary and ambient components is a key step in stereo upmixing, and it is often based on principal component analysis (PCA). However, a major shortcoming of the PCA-based method is that the accuracy of the decomposed components depends on both the primary-to-ambient power ratio (PAR) and the panning angle. Previously, a modified PCA was suggested to solve the PAR-dependence problem, but its performance still depends on the panning angle of the primary signal. In this paper, we propose a new PCA-based primary-ambient decomposition algorithm whose performance is affected by neither the PAR nor the panning angle. The proposed algorithm finds scale factors based on a criterion that preserves the powers of the mixed components, so that the original primary and ambient powers are correctly retrieved. Simulation results are presented to show the effectiveness of the proposed algorithm.
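
The baseline that this abstract builds on, plain PCA-based primary-ambient decomposition, can be sketched as follows. The scale-factor correction proposed in the paper is not included; the function name `pca_primary_ambient`, the signal model, and the test parameters are assumptions for illustration.

```python
import numpy as np

def pca_primary_ambient(stereo):
    """Split a zero-mean (2, n_samples) stereo signal with plain PCA."""
    cov = stereo @ stereo.T / stereo.shape[1]   # 2 x 2 channel covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # ascending eigenvalues
    u = eigvecs[:, -1]                          # direction of largest power
    primary = np.outer(u, u @ stereo)           # projection onto that direction
    ambient = stereo - primary                  # what the projection leaves out
    return primary, ambient, u

rng = np.random.default_rng(0)
n = 20000
theta = np.pi / 6                               # hypothetical panning angle
pan = np.array([np.cos(theta), np.sin(theta)])
source = rng.normal(size=n)                     # primary (panned) source
stereo = np.outer(pan, source) + 0.1 * rng.normal(size=(2, n))  # plus weak ambience

primary, ambient, u = pca_primary_ambient(stereo)
```

In this synthetic case the principal direction `u` aligns with the panning vector up to sign. The estimated primary component still contains some ambience energy, which relates to the accuracy issue the abstract raises.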

Design of pRBFNNs Pattern Classifier-based Face Recognition System Using 2-Directional 2-Dimensional PCA Algorithm ((2D)2PCA 알고리즘을 이용한 pRBFNNs 패턴분류기 기반 얼굴인식 시스템 설계)

  • Oh, Sung-Kwun;Jin, Yong-Tak
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.1 / pp.195-201 / 2014
  • In this study, a face recognition system is designed based on a polynomial Radial Basis Function Neural Networks (pRBFNNs) pattern classifier using the 2-directional 2-dimensional principal component analysis algorithm. Conventional one-dimensional PCA reduces the dimension of an image expressed as a vector whose length is the product of the numbers of rows and columns. In contrast, $(2D)^2PCA$ (2-Directional 2-Dimensional Principal Component Analysis) reduces the dimension along both the rows and the columns of the image, and the proposed intelligent pattern classifier is then evaluated on the reduced images. The proposed pRBFNNs consist of three functional modules: the condition part, the conclusion part, and the inference part. In the condition part of the fuzzy rules, the input space is partitioned with the aid of fuzzy c-means clustering. In the conclusion part of the rules, the connection weights of the RBFNNs are represented as polynomials of linear type. The essential design parameters of the networks (including the number of inputs and the fuzzification coefficient) are optimized by means of Differential Evolution. The recognition rate is obtained and evaluated on the Yale and AT&T datasets, which are widely used in face recognition. Additionally, the IC&CI Lab dataset is used for performance evaluation.
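
The row-and-column reduction performed by $(2D)^2PCA$ can be sketched in NumPy as below. This covers only the feature extraction step, under the usual formulation with one projection matrix per direction; the function name `two_directional_2dpca`, the image sizes, and the random images are assumptions, and the pRBFNNs classifier is not shown.

```python
import numpy as np

def two_directional_2dpca(images, d, q):
    """images: (N, m, n). Returns Z (m x q), X (n x d) and features (N, q, d)."""
    centered = images - images.mean(axis=0)
    count = len(images)
    # Row-direction scatter, mean of A^T A: shape (n, n).
    G_row = np.einsum('kij,kil->jl', centered, centered) / count
    # Column-direction scatter, mean of A A^T: shape (m, m).
    G_col = np.einsum('kij,klj->il', centered, centered) / count
    # eigh sorts ascending, so reverse the columns and keep the leading ones.
    X = np.linalg.eigh(G_row)[1][:, ::-1][:, :d]
    Z = np.linalg.eigh(G_col)[1][:, ::-1][:, :q]
    features = Z.T @ images @ X                  # each image becomes q x d
    return Z, X, features

rng = np.random.default_rng(0)
images = rng.normal(size=(20, 32, 28))           # hypothetical 32 x 28 images
Z, X, features = two_directional_2dpca(images, d=5, q=6)
```

Each 32 x 28 image is reduced to a 6 x 5 feature matrix, and the scatter matrices involved are only 28 x 28 and 32 x 32, whereas vectorizing the image first would require an 896 x 896 covariance matrix.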

Efficient Speaker Identification based on Robust VQ-PCA (강인한 VQ-PCA에 기반한 효율적인 화자 식별)

  • Lee, Ki-Yong
    • Journal of Internet Computing and Services / v.5 no.3 / pp.57-62 / 2004
  • In this paper, an efficient speaker identification method based on robust vector quantization-principal component analysis (VQ-PCA) is proposed to solve the problems caused by outliers and the high dimensionality of training feature vectors in speaker identification. First, the proposed method partitions the data space into several disjoint regions by robust VQ based on M-estimation. Second, a robust PCA is obtained from the covariance matrix in each region. Finally, our method obtains the Gaussian mixture model (GMM) for a speaker from the feature vectors transformed, with reduced dimension, by the robust PCA in each region. Compared to the conventional GMM with diagonal covariance matrices, the proposed method gives faster results with less storage at the same performance and, moreover, is robust to outliers.
