• Title/Summary/Keyword: multidimensional reduction (다차원 축소)

Search Result 32, Processing Time 0.027 seconds

Evaluation of Multivariate Stream Data Reduction Techniques (다변량 스트림 데이터 축소 기법 평가)

  • Jung, Hung-Jo;Seo, Sung-Bo;Cheol, Kyung-Joo;Park, Jeong-Seok;Ryu, Keun-Ho
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.889-900 / 2006
  • Sensor networks differ in user requests and data characteristics from one application area to another, yet existing research on the stream data transmission problem focuses on improving the performance of individual methods rather than on the inherent characteristics of stream data. In this paper, we introduce a hierarchical or distributed sensor network architecture and data model, and then evaluate multivariate data reduction methods suited to user requirements and data features so that reduction methods can be applied selectively. To assess the relative performance of the multivariate data reduction methods, we used conventional techniques such as Wavelet, HCL (Hierarchical Clustering), Sampling, and SVD (Singular Value Decomposition), together with experimental data sets such as multivariate time series, synthetic data, and robot execution failure data. The experimental results show that the SVD and Sampling methods are superior to Wavelet and HCL with respect to relative error ratio and execution time. In particular, because the relative error ratio of each data reduction method varies with the data characteristics, selecting the reduction method per experimental data set yields good performance. The findings reported in this paper can serve as a useful guideline for the design and construction of sensor network applications that involve multivariate stream data.
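
As a rough illustration of how a relative error ratio can be used to compare reduction methods, the sketch below reduces a small multivariate series by systematic sampling, reconstructs it by repeating each kept row, and measures the error. The function names, the hold-last reconstruction, the toy data, and the error definition (L2 norm of the difference divided by the L2 norm of the original) are assumptions for illustration, not the paper's procedure.

```python
import math

def sample_reduce(rows, step):
    """Keep every `step`-th row of a multivariate series (systematic sampling)."""
    return rows[::step]

def reconstruct(sampled, step, length):
    """Rebuild a series of `length` rows by repeating each kept row `step` times."""
    return [sampled[i // step] for i in range(length)]

def relative_error(original, approx):
    """Assumed definition: ||original - approx||_2 / ||original||_2 over all values."""
    num = sum((o - a) ** 2 for ro, ra in zip(original, approx) for o, a in zip(ro, ra))
    den = sum(o ** 2 for ro in original for o in ro)
    return math.sqrt(num / den)

# Hypothetical two-variable sensor stream (e.g. temperature, humidity).
stream = [[20.0 + 0.1 * t, 50.0 - 0.2 * t] for t in range(12)]
reduced = sample_reduce(stream, step=3)
approx = reconstruct(reduced, step=3, length=len(stream))
print(len(reduced), "rows kept out of", len(stream))
print("relative error:", relative_error(stream, approx))
```

The same `relative_error` function could be applied to the output of any other reduction method, which is what makes it usable as a common yardstick.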

Indexing and Searching for Reduced-Dimensional Vectors (차원 축소 벡터들을 위한 인덱싱 및 검색)

  • Jeong, Seung-Do;Kim, Sang-Wook;Choi, Byung-Uk
    • Journal of KIISE:Databases / v.37 no.1 / pp.44-49 / 2010
  • In this paper, we first address the problems associated with indexing and searching for reduced-dimensional vectors, which are reduced by using a combination of angle approximation and dimension grouping. Then, we propose a novel method to solve the problems. We also show the superiority of the proposed method by performing extensive experiments with synthetic and real-life data sets.

Data Preprocessing Techniques for Visualizing Gas Sensor Datasets (가스 센서 데이터셋 시각화를 위한 데이터 전처리 기법)

  • Kim, Junsu;Park, Kyungwon;Lim, Taebum;Park, Gooman
    • Proceedings of the Korean Society of Broadcast Engineers Conference / fall / pp.21-22 / 2021
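  • Recently, research on olfactory intelligence technology for precise, AI (Artificial Intelligence)-based detection of gas components has been actively under way. Olfactory intelligence training data is multimodal, because gas sensors with different sensing principles are applied simultaneously, and it also has multidimensional time-series characteristics, because it is acquired through sensor arrays distributed in space. Accurate understanding and analysis of large volumes of multidimensional data therefore requires techniques for preprocessing and visualizing the data. In this paper, to visualize complex multidimensional gas data for olfactory intelligence training, we present a preprocessing method that removes unnecessary values such as noise, makes the data consistent, and reduces the dimensionality of the data so that it can be visualized.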

A review on the t-distributed stochastic neighbors embedding (t-SNE에 대한 요약)

  • Kipoong Kim;Choongrak Kim
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.167-173 / 2023
  • This paper investigates several methods of visualizing high-dimensional data in a low-dimensional space. First, principal component analysis and multidimensional scaling are briefly introduced as linear approaches, and then kernel principal component analysis, self-organizing map, locally linear embedding, Isomap, Laplacian Eigenmaps, and local multidimensional scaling are introduced as nonlinear approaches. In particular, t-SNE, which is widely used but relatively unfamiliar in the field of statistics, is described in more detail. We also present a simple example for several methods, including t-SNE. Finally, we review several recent studies that point out the limitations of t-SNE and discuss the future research problems they raise.

Reducing the Number of Hidden Nodes in MLP using the Vertex of Hidden Layer's Hypercube (은닉층 다차원공간의 Vertex를 이용한 MLP의 은닉 노드 축소방법)

  • 곽영태;이영직;권오석
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.9B / pp.1775-1784 / 1999
  • This paper proposes a method of removing unnecessary hidden nodes by means of a new cost function that evaluates the variance and the mean of hidden node outputs during training. The proposed cost function drives necessary hidden nodes to be activated and unnecessary hidden nodes to become constants. The constant hidden nodes can then be removed without performance degradation. In CEDAR handwritten digit recognition experiments, we show that the proposed method can reduce the number of hidden nodes by up to 37.2%, with a higher recognition rate and shorter learning time.
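
Removing a constant hidden node costs nothing in accuracy because such a node contributes only a fixed offset to the next layer, and that offset can be folded into the next layer's bias. A minimal sketch of this pruning step follows. The variance threshold, the function name, and the single output layer are assumptions for illustration, and the paper's training cost function is not reproduced here.

```python
from statistics import fmean, pvariance

def prune_constant_nodes(hidden_outputs, out_weights, out_bias, var_threshold=1e-4):
    """Remove hidden nodes whose outputs are nearly constant over the samples.

    hidden_outputs: list of samples, each a list of hidden-node activations.
    out_weights:    out_weights[k][j] is the weight from hidden node j to output k.
    out_bias:       list of output biases.
    Returns (kept_indices, new_weights, new_bias).
    """
    n_hidden = len(hidden_outputs[0])
    columns = [[sample[j] for sample in hidden_outputs] for j in range(n_hidden)]
    keep = [j for j in range(n_hidden) if pvariance(columns[j]) > var_threshold]
    drop = [j for j in range(n_hidden) if j not in keep]

    # Fold each dropped node's mean contribution into the output bias.
    new_bias = [
        b + sum(w_row[j] * fmean(columns[j]) for j in drop)
        for w_row, b in zip(out_weights, out_bias)
    ]
    new_weights = [[w_row[j] for j in keep] for w_row in out_weights]
    return keep, new_weights, new_bias

# Hypothetical activations: node 1 is stuck at 0.9 for every sample.
acts = [[0.1, 0.9, 0.7], [0.8, 0.9, 0.2], [0.4, 0.9, 0.5]]
weights = [[1.0, 2.0, -1.0]]
bias = [0.5]
keep, new_w, new_b = prune_constant_nodes(acts, weights, bias)
print(keep, new_w, new_b)
```

With exactly constant nodes, the pruned layer produces the same outputs as the original. With nearly constant nodes, the result is an approximation whose quality depends on the chosen threshold.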

A dimensional reduction method in cluster analysis for multidimensional data: principal component analysis and factor analysis comparison (다차원 데이터의 군집분석을 위한 차원축소 방법: 주성분분석 및 요인분석 비교)

  • Hong, Jun-Ho;Oh, Min-Ji;Cho, Yong-Been;Lee, Kyung-Hee;Cho, Wan-Sup
    • The Journal of Bigdata / v.5 no.2 / pp.135-143 / 2020
  • This paper proposes a pre-processing method and a dimension reduction method for shopping cart analysis, where many variables are correlated, when segmenting consumer types in agri-food consumer panel data. Cluster analysis is a widely used method for dividing observations into several clusters in multivariate data. However, cluster analysis after dimension reduction may be more effective when several variables are related. In this paper, food consumption data surveyed from 1,987 households was clustered using the K-means method, and 17 variables were re-selected to divide it into clusters. Principal component analysis and factor analysis were compared as solutions to the multicollinearity problem and as ways to reduce dimensions for clustering. In this study, both principal component analysis and factor analysis reduced the dataset to two dimensions. Although principal component analysis divided the dataset into three clusters, the differences among the cluster characteristics did not appear clearly. Under factor analysis, however, the characteristics of the clusters in the consumption pattern were well distinguished.

Dimension Reduction Method of Speech Feature Vector for Real-Time Adaptation of Voice Activity Detection (음성구간 검출기의 실시간 적응화를 위한 음성 특징벡터의 차원 축소 방법)

  • Park Jin-Young;Lee Kwang-Seok;Hur Kang-In
    • Journal of the Institute of Convergence Signal Processing / v.7 no.3 / pp.116-121 / 2006
  • In this paper, we propose a dimension reduction method for multi-dimensional speech feature vectors, intended for real-time adaptation in various noisy environments. The method reduces dimensions non-linearly by mapping to the likelihoods of the speech feature vector and the noise feature vector. The LRT (Likelihood Ratio Test) is used to classify speech and non-speech. The detection results are similar to those obtained with the multi-dimensional speech feature vector. Speech recognition results on the detected speech data are also similar to those obtained with the multi-dimensional (10-order MFCC (Mel-Frequency Cepstral Coefficient)) speech feature vector.
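
One way to read this approach is that a multi-dimensional feature vector is mapped to two values, its log-likelihood under a speech model and under a noise model, and the LRT then compares them. The sketch below follows that reading. The diagonal-covariance Gaussian models, their parameters, the function names, and the zero threshold are all assumptions for illustration; none of them come from the paper.

```python
import math

def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of vector x under a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def to_likelihood_pair(x, speech_model, noise_model):
    """Map a feature vector to 2 dimensions: (speech log-lik, noise log-lik)."""
    return (diag_gauss_loglik(x, *speech_model), diag_gauss_loglik(x, *noise_model))

def is_speech(x, speech_model, noise_model, threshold=0.0):
    """Likelihood ratio test: declare speech if the log-ratio exceeds the threshold."""
    ll_speech, ll_noise = to_likelihood_pair(x, speech_model, noise_model)
    return (ll_speech - ll_noise) > threshold

# Hypothetical 3-dimensional models given as (mean, variance) pairs; a real
# system would estimate them from data such as 10-order MFCC vectors.
speech = ([2.0, 1.0, 0.5], [1.0, 1.0, 1.0])
noise = ([0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
print(is_speech([1.9, 1.2, 0.4], speech, noise))
print(is_speech([0.1, -0.2, 0.0], speech, noise))
```

Whatever the input dimension, the decision depends only on the two likelihood values, which is the sense in which the mapping reduces dimensionality.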

A study on reduction of sensibility dimension for selection of wallpaper (벽지 선택을 위한 감성 차원 축소에 관한 연구)

  • Chun Young-Min;Kim Soon-Young;Kim Sung-Hwan;Chung Sung-Suk
    • Science of Emotion and Sensibility / v.8 no.4 / pp.333-344 / 2005
  • Sensibility adjectives describing wallpaper were collected, with the aim of developing a model that can recommend wallpaper to customers. A large number of adjectives describing affective responses were collected from diverse sources such as questionnaire surveys, field surveys, and internet surveys. To find representative adjectives among those collected, we used several statistical analysis methods. We attempted to name the dimension axes through MDS (Multi-Dimensional Scaling) analysis using the similarity matrix, and to find three or four reduced factors through factor analysis with varimax rotation. The analysis showed that the reduced factors account for about 82% when the number of factors is three (popular, elegance, and passable) and about 93% when the number of factors is four (elegance, passable, beautiful, and affectionate). On the basis of this result, we expect that it can be used to develop a model for recommending wallpaper.

Study on Dimension Reduction algorithm for unsupervised clustering of the DMR's RF-fingerprinting features (무선단말기 RF-fingerprinting 특징의 비지도 클러스터링을 위한 차원축소 알고리즘 연구)

  • Young-Giu Jung;Hak-Chul Shin;Sun-Phil Nah
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.23 no.3 / pp.83-89 / 2023
  • Clustering techniques based on RF fingerprints extract the characteristic signatures of transmitters that are embedded in the transmission waveforms. The output of the RF-Fingerprint feature extraction algorithm for clustering identical DMRs (Digital Mobile Radios) is a high-dimensional feature, typically consisting of 512 or more dimensions. While such high-dimensional features may be effective for classifiers, they are not suitable as inputs to clustering algorithms. Therefore, this paper proposes a dimension reduction algorithm that effectively reduces the dimensionality of the multidimensional RF-Fingerprint features while maintaining the fingerprinting characteristics of the DMRs. Additionally, it proposes a clustering algorithm that can effectively cluster the reduced dimensions. The proposed clustering algorithm reduces the multi-dimensional RF-Fingerprint features using t-SNE, which is based on KL divergence, and performs clustering using Density Peaks Clustering (DPC). The performance analysis of the DMR clustering algorithm uses a dataset of 3000 samples collected from 10 Motorola XiR and 10 Wintech N-Series DMRs. The RF-Fingerprinting-based clustering algorithm formed 20 clusters, and all performance metrics, including Homogeneity, Completeness, and V-measure, reached 99.4%.
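
For readers unfamiliar with the reported metrics, the sketch below computes homogeneity, completeness, and V-measure from their standard entropy-based definitions. It assumes natural logarithms and equal weighting of the two components. The helper names are illustrative, and the code is not taken from the paper's evaluation.

```python
import math
from collections import Counter

def _entropy(labels):
    """Shannon entropy of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def _conditional_entropy(target, given):
    """H(target | given), computed from the joint label counts."""
    n = len(target)
    joint = Counter(zip(target, given))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (_, g), c in joint.items()
    )

def v_measure(true_labels, cluster_labels):
    """Return (homogeneity, completeness, V-measure) with equal weighting."""
    h_c, h_k = _entropy(true_labels), _entropy(cluster_labels)
    hom = 1.0 if h_c == 0 else 1.0 - _conditional_entropy(true_labels, cluster_labels) / h_c
    com = 1.0 if h_k == 0 else 1.0 - _conditional_entropy(cluster_labels, true_labels) / h_k
    v = 0.0 if hom + com == 0 else 2.0 * hom * com / (hom + com)
    return hom, com, v

# Hypothetical device labels and cluster assignments (cluster ids are arbitrary).
devices = [0, 0, 1, 1, 2, 2]
clusters = [1, 1, 0, 0, 2, 2]
print(v_measure(devices, clusters))
```

Homogeneity penalizes clusters that mix several devices and completeness penalizes devices split across clusters, so a result where all three scores are high indicates close to one cluster per device.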

Efficient Processing of Multidimensional Sensor stream Data in Digital Marine Vessel (디지털 선박 내 다차원 센서 스트림 데이터의 효율적인 처리)

  • Song, Byoung-Ho;Park, Kyung-Woo;Lee, Jin-Seok;Lee, Keong-Hyo;Jung, Min-A;Lee, Sung-Ro
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.5B / pp.794-800 / 2010
  • Digital marine vessels require accurate and efficient management of the digital data measured by their various sensors. It is not efficient for a sensor network to process a large volume of input stream data while storing all of it in a database at the same time. In this paper, we propose a way to improve the processing performance for multidimensional stream data arriving continuously from multiple sensors. We deploy several sensors (temperature, humidity, lighting, voice), process queries over a sliding window to handle the input stream efficiently, build multiple query plans with the MJoin method, and reduce the stored data using an SVM algorithm. Data that is not needed is automatically deleted from the database, and the remaining data is made available to a ship diagnosis system. As a result, we obtained a database reduction rate of about 18.3% using 35,912 data sets.
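
The sliding-window query processing mentioned above can be sketched as follows: only the most recent readings per sensor are held in memory, and a continuous query (here an average) is answered from that window. The class name, the count-based window, and the sample readings are assumptions for illustration. The MJoin query plans and the SVM-based reduction are not shown.

```python
from collections import deque

class SlidingWindowAverage:
    """Count-based sliding window that answers an average query per sensor."""

    def __init__(self, size):
        self.size = size
        self.windows = {}  # sensor name -> deque of the most recent readings

    def push(self, sensor, value):
        """Add a reading; the oldest one is evicted once the window is full."""
        window = self.windows.setdefault(sensor, deque(maxlen=self.size))
        window.append(value)

    def average(self, sensor):
        """Continuous query: mean of the readings currently in the window."""
        window = self.windows[sensor]
        return sum(window) / len(window)

# Hypothetical temperature readings arriving one at a time.
win = SlidingWindowAverage(size=3)
for reading in [20.0, 21.0, 22.0, 26.0]:
    win.push("temperature", reading)
print(win.average("temperature"))
```

Because `deque(maxlen=...)` discards the oldest reading automatically, memory use stays bounded no matter how long the stream runs.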