• Title, Summary, Keyword: Dimensionality Reduction


Dimensionality reduction for pattern recognition based on difference of distribution among classes

  • Nishimura, Masaomi;Hiraoka, Kazuyuki;Mishima, Taketoshi
    • Proceedings of the IEEK Conference, pp.1670-1673, 2002
  • For pattern recognition on high-dimensional data such as images, dimensionality reduction is an effective preprocessing step. Dimensionality reduction (1) reduces storage requirements and the amount of computation, and (2) avoids "the curse of dimensionality" and thereby improves classification performance. Popular tools for dimensionality reduction are Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and, more recently, Independent Component Analysis (ICA). Among them, only LDA takes the class labels into consideration. Nevertheless, it has been reported that classification performance with ICA is better than with LDA, because LDA restricts the number of dimensions after reduction. To overcome this dilemma, we propose a new dimensionality reduction technique based on an information-theoretic measure of the difference between distributions. It takes the class labels into consideration, yet it places no restriction on the number of dimensions after reduction. The improvement in classification performance has been confirmed experimentally.
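
The restriction on LDA mentioned in this abstract follows from the rank of the between-class scatter matrix. A brief sketch in standard textbook notation (the symbols $C$ for the number of classes, $n_c$ for class sizes, $\mu_c$ and $\mu$ for class and overall means, and $S_B$, $S_W$ for the between- and within-class scatter matrices are assumptions, not taken from the abstract):

```latex
S_B = \sum_{c=1}^{C} n_c (\mu_c - \mu)(\mu_c - \mu)^{\top},
\qquad
\sum_{c=1}^{C} n_c (\mu_c - \mu) = 0
\;\Longrightarrow\;
\operatorname{rank}(S_B) \le C - 1 .
```

Because the LDA projection directions are the eigenvectors of $S_W^{-1} S_B$ with nonzero eigenvalues, at most $C - 1$ discriminative dimensions remain; a two-class problem, for example, is reduced to a single dimension.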


Data Visualization using Linear and Non-linear Dimensionality Reduction Methods

  • Kim, Junsuk;Youn, Joosang
    • Journal of the Korea Society of Computer and Information, v.23 no.12, pp.21-26, 2018
  • Now that large amounts of data can be stored efficiently, methods for extracting meaningful features from big data have become important. In particular, techniques for converting high-dimensional data to low-dimensional data are crucial for data visualization. In this study, principal component analysis (PCA; a linear dimensionality reduction technique) and Isomap (a non-linear dimensionality reduction technique) are introduced and applied to neural big data obtained by functional magnetic resonance imaging (fMRI). First, we investigate how well the physical properties of the stimuli are preserved after dimensionality reduction. We then compare the residual variance of the two methods to quantify the amount of information left unexplained. As a result, the Isomap embedding retains more information than the PCA embedding. Our results demonstrate that it is necessary to consider not only linear but also nonlinear characteristics in big data analysis.
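
The abstract does not define residual variance. A minimal sketch, assuming the definition commonly used in the Isomap literature (one minus the squared correlation between reference distances and distances in the embedding), might look like the following. The function names and the toy points are hypothetical. For PCA the reference distances would be Euclidean distances in the input space; for Isomap they would be graph (geodesic) distances, which this sketch does not compute.

```python
import math

def pairwise_distances(points):
    """Euclidean distances between all pairs of points (upper triangle, flattened)."""
    return [math.dist(p, q) for i, p in enumerate(points) for q in points[i + 1:]]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def residual_variance(reference_distances, embedding):
    """1 - R^2 between the reference distances and the embedding's distances."""
    r = pearson(reference_distances, pairwise_distances(embedding))
    return 1.0 - r * r

# Toy data: four nearly collinear 3-D points and a 1-D embedding of them.
high_dim = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.1), (3.0, 0.0, 0.0)]
embedding_1d = [(0.0,), (1.0,), (2.0,), (3.0,)]
reference = pairwise_distances(high_dim)
print(residual_variance(reference, embedding_1d))
```

A value near zero means the embedding reproduces the reference distances almost perfectly; larger values indicate information that the low-dimensional representation fails to explain.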

A Novel Speech/Music Discrimination Using Feature Dimensionality Reduction

  • Keum, Ji-Soo;Lee, Hyon-Soo;Hagiwara, Masafumi
    • International Journal of Fuzzy Logic and Intelligent Systems, v.10 no.1, pp.7-11, 2010
  • In this paper, we propose an improved speech/music discrimination method based on a feature combination and dimensionality reduction approach. To improve discrimination ability, we use a feature based on spectral duration analysis and employ the hierarchical dimensionality reduction (HDR) method to reduce the effect of correlated features. Experiments on a variety of speech and music data show that the proposed method achieves high discrimination performance compared with conventional methods.

Design of Gas Identification System with Hierarchically Identifiable Rule base using GAS and Rough Sets (유전알고리즘과 러프집합을 이용한 계층적 식별 규칙을 갖는 가스 식별 시스템의 설계)

  • Haibo, Zhao;Bang, Young-Keun;Lee, Chul-Heui
    • Journal of Industrial Technology, v.31 no.B, pp.37-43, 2011
  • In pattern analysis, dimensionality reduction and the generation of reasonable identification rules are very important. This paper performs dimensionality reduction effectively by grouping sensors whose measured patterns are similar to each other, using genetic algorithms for the combinatorial optimization. To identify the gas type, a hierarchically identifiable rule base with two frames is constructed using rough set theory. The first frame captures the measurement characteristics of each sensor, and the second reflects the identification patterns of each group. The proposed method thus achieves both effective dimensionality reduction and accurate gas identification. In simulation, the effectiveness of the proposed method is demonstrated by identifying five types of gases.


Boosting Multifactor Dimensionality Reduction Using Pre-evaluation

  • Hong, Yingfu;Lee, Sangbum;Oh, Sejong
    • ETRI Journal, v.38 no.1, pp.206-215, 2016
  • The detection of gene-gene interactions in genetic studies of common human diseases is important, and the technique of multifactor dimensionality reduction (MDR) has been widely applied to this end. However, this technique is not free from the "curse of dimensionality"; that is, it works well for two- or three-way interactions but requires a long execution time and extensive computing resources to detect, for example, a 10-way interaction. Here, we propose a boosting method to reduce MDR execution time. With the use of pre-evaluation measurements, gene sets with low levels of interaction can be removed prior to the application of MDR. Thus, the problem space is decreased and considerable time can be saved in the execution of MDR.
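
To illustrate why higher-order interactions are costly, the following is a minimal sketch of the basic MDR step (pooling genotype cells into high-risk and low-risk groups) together with the number of SNP combinations an exhaustive search must score. It is not the paper's pre-evaluation method, whose measure the abstract does not specify; the function names, the toy data, and the figure of 100 SNPs are all assumptions, and cross-validation is omitted.

```python
from collections import Counter
from itertools import combinations
from math import comb

def mdr_accuracy(genotypes, labels, snps):
    """Training accuracy of the MDR high/low-risk rule for one SNP combination."""
    cases, controls = Counter(), Counter()
    for row, label in zip(genotypes, labels):
        cell = tuple(row[i] for i in snps)  # multilocus genotype cell
        if label == 1:
            cases[cell] += 1
        else:
            controls[cell] += 1
    n_cases, n_controls = sum(cases.values()), sum(controls.values())
    correct = 0
    for cell in set(cases) | set(controls):
        # A cell is high risk if its case:control ratio is at least the overall ratio
        # (compared by cross-multiplication to avoid division by zero).
        high_risk = cases[cell] * n_controls >= controls[cell] * n_cases
        correct += cases[cell] if high_risk else controls[cell]
    return correct / len(labels)

def best_combination(genotypes, labels, order):
    """Exhaustively score every SNP combination of the given order."""
    n_snps = len(genotypes[0])
    return max(combinations(range(n_snps), order),
               key=lambda snps: mdr_accuracy(genotypes, labels, snps))

# Toy data: the label is the XOR of SNP 0 and SNP 1; SNP 2 is noise.
genotypes = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ b for a, b, _ in genotypes]
print(best_combination(genotypes, labels, order=2))

# Number of combinations to score among 100 SNPs: 2-way versus 10-way.
print(comb(100, 2), comb(100, 10))
```

Removing low-interaction gene sets before this exhaustive loop, as the abstract proposes, shrinks the set of combinations that `best_combination` would have to enumerate.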

Dimensionality Reduction in Speech Recognition by Principal Component Analysis (음성인식에서 주 성분 분석에 의한 차원 저감)

  • Lee, Chang-Young
    • The Journal of the Korea Institute of Electronic Communication Sciences, v.8 no.9, pp.1299-1305, 2013
  • In this paper, we investigate a method of reducing the computational cost of speech recognition by dimensionality reduction of MFCC feature vectors. Eigendecomposition applied to the feature vectors yields a linear transformation that orders the vector components by variance. The first component has the largest variance and hence serves as the most important one for pattern classification. We therefore consider reducing the computational cost without degrading recognition performance by excluding the least-variance components. Experimental results show that the number of MFCC components can be reduced by about half without a significant adverse effect on the recognition error rate.
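
The transformation described in this abstract can be sketched with NumPy as follows. This is an illustrative sketch rather than the author's implementation: the function name `pca_reduce`, the 12-coefficient vector size, and the synthetic data standing in for real MFCC frames are assumptions.

```python
import numpy as np

def pca_reduce(features, n_keep):
    """Project rows of `features` onto the n_keep highest-variance principal axes."""
    centered = features - features.mean(axis=0)
    covariance = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]  # re-sort so the largest variance comes first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    projected = centered @ eigenvectors[:, :n_keep]
    return projected, eigenvalues

rng = np.random.default_rng(0)
# Synthetic stand-in for MFCC vectors: 500 frames x 12 coefficients (not real speech).
scales = np.linspace(3.0, 0.1, 12)
mfcc_like = rng.normal(size=(500, 12)) * scales
reduced, variances = pca_reduce(mfcc_like, n_keep=6)

# Fraction of total variance retained after dropping the six least-variance components.
retained = variances[:6].sum() / variances.sum()
print(reduced.shape, float(retained))
```

Each retained component's sample variance equals the corresponding eigenvalue, so the ratio printed at the end indicates how much of the signal's variance survives when half of the components are excluded.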

An Effective Method for Dimensionality Reduction in High-Dimensional Space (고차원 공간에서 효과적인 차원 축소 기법)

  • Jeong Seung-Do;Kim Sang-Wook;Choi Byung-Uk
    • Journal of the Institute of Electronics Engineers of Korea CI, v.43 no.4, pp.88-102, 2006
  • In multimedia information retrieval, multimedia data are represented as vectors in a high-dimensional space. To search these vectors effectively, a variety of indexing methods have been proposed. However, the performance of these indexing methods degrades dramatically with increasing dimensionality, which is known as the dimensionality curse. To resolve the dimensionality curse, dimensionality reduction methods have been proposed. They map feature vectors in a high-dimensional space to vectors in a low-dimensional space before indexing the data. This paper proposes a method for dimensionality reduction based on a function approximating the Euclidean distance, which makes use of the norm and angle components of a vector. First, we identify the causes of errors in the angle estimation used to approximate the Euclidean distance, and discuss basic directions for reducing those errors. Then, we propose a novel method for dimensionality reduction that composes a set of subvectors from a feature vector and maintains only the norm and the estimated angle for every subvector. The selection of a good reference vector is important for accurate estimation of the angle component. We present criteria for a good reference vector, and propose a method that chooses one using the Levenberg-Marquardt algorithm. We also define a novel distance function and formally prove that it lower-bounds the Euclidean distance. This implies that our approach does not incur any false dismissals while reducing the dimensionality effectively. Finally, we verify the superiority of the proposed method via performance evaluation with extensive experiments.
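
The norm-and-angle idea can be illustrated with a small sketch. It is not the paper's distance function or its Levenberg-Marquardt reference selection. It assumes fixed all-ones reference subvectors and relies on the standard fact that the angle between two vectors is at least the difference of their angles to a common reference, which gives a lower bound on the Euclidean distance. All function names are hypothetical.

```python
import math
import random

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def angle_to(v, ref):
    """Angle in [0, pi] between v and a nonzero reference vector."""
    nv, nr = norm(v), norm(ref)
    if nv == 0.0:
        return 0.0  # the angle of a zero vector does not affect the bound
    cosine = sum(a * b for a, b in zip(v, ref)) / (nv * nr)
    return math.acos(max(-1.0, min(1.0, cosine)))  # clamp against rounding error

def reduce_vector(v, refs, size):
    """Keep only (norm, angle to reference) for each subvector of length `size`."""
    chunks = [v[i:i + size] for i in range(0, len(v), size)]
    return [(norm(c), angle_to(c, r)) for c, r in zip(chunks, refs)]

def lower_bound_distance(reduced_a, reduced_b):
    """Lower bound on the Euclidean distance from the reduced representations."""
    total = 0.0
    for (na, ta), (nb, tb) in zip(reduced_a, reduced_b):
        # Law of cosines with the smallest angle consistent with the stored angles.
        total += na * na + nb * nb - 2.0 * na * nb * math.cos(ta - tb)
    return math.sqrt(max(total, 0.0))

random.seed(0)
refs = [[1.0] * 4, [1.0] * 4]  # assumed reference subvectors, one per chunk
a = [random.gauss(0.0, 1.0) for _ in range(8)]
b = [random.gauss(0.0, 1.0) for _ in range(8)]
bound = lower_bound_distance(reduce_vector(a, refs, 4), reduce_vector(b, refs, 4))
print(bound, math.dist(a, b))
```

Here each 8-dimensional vector is stored as four numbers (two norms and two angles). Because the bound never exceeds the true distance, a range query filtered with it cannot discard a true match, which is the no-false-dismissal property the abstract refers to.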

Design of Gas Identification System with Hierarchical Rule base using Genetic Algorithms and Rough Sets (유전 알고리즘과 러프 집합을 이용한 계층적 식별 규칙을 갖는 가스 식별 시스템의 설계)

  • Bang, Young-Keun;Byun, Hyung-Gi;Lee, Chul-Heui
    • The Transactions of The Korean Institute of Electrical Engineers, v.61 no.8, pp.1164-1171, 2012
  • Recently, machine olfactory systems, as artificial substitutes for the human olfactory system, have been studied actively because they can detect dangerous gases and identify the type of gas in contaminated areas in place of humans. In this paper, we present an effective design method for a gas identification system. Dimensionality reduction is a very important part of pattern analysis; we handle it effectively by grouping sensors whose measured patterns are similar to each other, using genetic algorithms for the combinatorial optimization. To identify the gas type, we construct a hierarchical rule base with two frames using rough set theory. The first frame captures the measurement characteristics of each sensor, and the second reflects the identification patterns of each group. The proposed method thus achieves both effective dimensionality reduction and accurate gas identification. In simulation, we demonstrate the effectiveness of the proposed method by identifying five types of gases.

Performance evaluation of principal component analysis for clustering problems

  • Kim, Jae-Hwan;Yang, Tae-Min;Kim, Jung-Tae
    • Journal of the Korean Society of Marine Engineering, v.40 no.8, pp.726-732, 2016
  • Clustering analysis is widely used in data mining to classify data into categories on the basis of their similarity. Over the decades, many clustering techniques have been developed, including hierarchical and non-hierarchical algorithms. In gene profiling problems, because of the large number of genes and the complexity of biological networks, dimensionality reduction techniques are critical exploratory tools for clustering analysis of gene expression data. Recently, clustering analysis that applies dimensionality reduction techniques has also been proposed. PCA (principal component analysis) is a popular dimensionality reduction technique for clustering problems. However, previous studies analyzed the performance of PCA only on full data sets. In this paper, to evaluate the performance of PCA for clustering analysis specifically and robustly, we use an improved FCBF (fast correlation-based filter), a feature selection method, on supervised clustering data sets, and employ two well-known clustering algorithms: k-means and k-medoids. Computational results on supervised data sets show that the performance of PCA is very poor for large-scale features.
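
The evaluation described here clusters labelled data and checks the clusters against the known classes. A minimal sketch of that last stage, using plain Lloyd-style k-means and a purity score, is given below. The FCBF and PCA stages are omitted, and the function names, the purity measure, and the toy points are assumptions rather than details from the paper.

```python
import math
from collections import Counter

def kmeans(points, initial_centroids, iterations=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    centroids = [tuple(c) for c in initial_centroids]
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids

def purity(assignment, labels):
    """Fraction of points that belong to the majority class of their cluster."""
    majority_total = 0
    for k in set(assignment):
        in_cluster = [lab for a, lab in zip(assignment, labels) if a == k]
        majority_total += Counter(in_cluster).most_common(1)[0][1]
    return majority_total / len(labels)

# Toy labelled data: two well-separated groups in two dimensions.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels = ["a", "a", "a", "b", "b", "b"]
assignment, centroids = kmeans(points, initial_centroids=[points[0], points[3]])
print(assignment, purity(assignment, labels))
```

Running the same scoring on features before and after a reduction step is one way to compare how well each representation supports clustering.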