• Title/Summary/Keyword: Data Dimension

Dimension Reduction Methods on High Dimensional Streaming Data with Concept Drift

  • Park, Cheong Hee
    • KIPS Transactions on Software and Data Engineering, v.5 no.8, pp.361-368, 2016
  • While dimension reduction methods on high dimensional data have been widely studied, research on dimension reduction methods for high dimensional streaming data with concept drift is limited. In this paper, we review incremental dimension reduction methods and propose a method to apply dimension reduction efficiently in order to improve classification performance on high dimensional streaming data with concept drift.

Applications of response dimension reduction in large p-small n problems

  • Minjee Kim; Jae Keun Yoo
    • Communications for Statistical Applications and Methods, v.31 no.2, pp.191-202, 2024
  • The goal of this paper is to show how response dimension reduction facilitates multivariate regression analysis with high-dimensional responses. Multivariate regression, characterized by multi-dimensional response variables, is increasingly prevalent in settings such as repeated measures, longitudinal studies, and functional data analysis. One of the key challenges in analyzing such data is managing the response dimension, which can complicate the analysis because the number of parameters increases exponentially. Although response dimension reduction methods have been developed, there is no practically useful illustration for various types of data, such as so-called large p-small n data. This paper aims to fill this gap by showing how response dimension reduction can enhance the analysis of high-dimensional response data, thereby assisting statistical practitioners and contributing to advances in multiple scientific domains.

Comparative Study of Dimension Reduction Methods for Highly Imbalanced Overlapping Churn Data

  • Lee, Sujee; Koo, Bonhyo; Jung, Kyu-Hwan
    • Industrial Engineering and Management Systems, v.13 no.4, pp.454-462, 2014
  • Retaining customers who are likely to churn is one of the most important issues in customer relationship management, so companies try to predict churning customers from their large-scale, high-dimensional data. This study focuses on handling large data sets by reducing their dimensionality. We compare six dimension reduction methods: principal component analysis (PCA), factor analysis (FA), locally linear embedding (LLE), local tangent space alignment (LTSA), locality preserving projections (LPP), and a deep auto-encoder. Our experiments apply each method to the training data, build a classification model on the mapped data, and measure performance by hit rate. In the results, PCA performs well despite its simplicity, and the deep auto-encoder gives the best overall performance. These results can be explained by the characteristics of the churn prediction data, which are highly correlated and overlap across classes. We also propose a simple out-of-sample extension method for the nonlinear dimension reduction methods LLE and LTSA that exploits these characteristics of the data.

Crack location in beams by data fusion of fractal dimension features of laser-measured operating deflection shapes

  • Bai, R.B.; Song, X.G.; Radzienski, M.; Cao, M.S.; Ostachowicz, W.; Wang, S.S.
    • Smart Structures and Systems, v.13 no.6, pp.975-991, 2014
  • The objective of this study is to develop a reliable method for locating cracks in a beam using data fusion of fractal dimension features of operating deflection shapes. Katz's fractal dimension curve of an operating deflection shape is used as the basic damage feature. Like most available damage features, Katz's fractal dimension curve has a notable limitation in characterizing damage: it is unresponsive to damage near the nodes of structural deformation responses such as operating deflection shapes. To address this limitation, Katz's fractal dimension curves of several operating deflection shapes are fused into a combined fractal damage feature, the 'overall Katz's fractal dimension curve'. This overall curve overcomes the nodal effect of operating deflection shapes, which maximizes its responsiveness to damage and the reliability of damage localization. The method is applied to damage detection in numerical and experimental cases of cantilever beams with single and multiple cracks, using high-resolution operating deflection shapes acquired by a scanning laser vibrometer. Results show that the overall Katz's fractal dimension curve locates single and multiple cracks in beams with significantly improved accuracy and reliability compared with the existing method. Data fusion of fractal dimension features of operating deflection shapes thus provides a viable strategy for identifying damage in beam-type structures, with robustness against node effects.
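
As a rough illustration of the basic feature named in this abstract, the sketch below computes Katz's fractal dimension of a sampled curve using the commonly cited formula FD = log10(n) / (log10(n) + log10(d/L)), where n is the number of steps, L the total curve length, and d the largest distance from the first point. The sliding-window helper `katz_fd_curve` and its `window` parameter are assumptions made for illustration; the abstract does not state how the authors build the curve or fuse several curves.

```python
import math


def katz_fd(xs, ys):
    """Katz fractal dimension of a planar curve given as sample points."""
    n = len(xs) - 1  # number of steps along the curve
    if n < 2:
        raise ValueError("need at least three points")
    # Total length of the curve: sum of distances between successive points.
    length = sum(math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i]) for i in range(n))
    # Extent of the curve: largest distance from the first point.
    extent = max(math.hypot(xs[i] - xs[0], ys[i] - ys[0]) for i in range(1, n + 1))
    if length == 0 or extent == 0:
        raise ValueError("degenerate curve")
    return math.log10(n) / (math.log10(n) + math.log10(extent / length))


def katz_fd_curve(xs, ys, window=8):
    """Hypothetical sliding-window version: one FD value per interior point."""
    half = window // 2
    return [
        katz_fd(xs[i - half:i + half + 1], ys[i - half:i + half + 1])
        for i in range(half, len(xs) - half)
    ]
```

A straight segment gives a value of 1, and local irregularities in the sampled shape push the windowed value above 1, which is why a peak in such a curve can serve as a damage indicator.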

Comparison of Methods for Reducing the Dimension of Compositional Data with Zero Values

  • Song, Taeg-Youn; Choi, Byung-Jin
    • Communications for Statistical Applications and Methods, v.19 no.4, pp.559-569, 2012
  • Compositional data consist of compositions, which are non-negative vectors of proportions subject to the unit-sum constraint. In disciplines such as petrology and archaeometry, statistical analysis of this type of data is fundamental. Aitchison (1983) introduced log-contrast principal component analysis, which operates on logratio-transformed data, as a dimension-reduction technique for understanding and interpreting the structure of compositional data. However, the analysis cannot be used when zero values are present in the data. In this paper, we introduce four possible methods to reduce the dimension of compositional data with zero values. Two real data sets are analyzed using the methods, and the results are compared.
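
To make the zero-value problem concrete, here is a minimal sketch of a centered logratio transform, one common form of the logratio transformation used before principal component analysis of compositions. The function name `clr` is chosen here for illustration, and the abstract does not specify which logratio variant or which four zero-handling methods the paper uses.

```python
import math


def clr(composition):
    """Centered logratio transform: log of each part minus the mean log."""
    # math.log raises ValueError for a zero part, which is exactly why
    # logratio-based analysis breaks down when the data contain zeros.
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]
```

Calling `clr([0.2, 0.3, 0.5])` returns coordinates that sum to zero, while `clr([0.5, 0.5, 0.0])` raises `ValueError` because the logarithm of zero is undefined.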

A Classification Method Using Data Reduction

  • Uhm, Daiho; Jun, Sung-Hae; Lee, Seung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems, v.12 no.1, pp.1-5, 2012
  • Data reduction has been used widely in data mining to make analysis more convenient. Principal component analysis (PCA) and factor analysis (FA) are popular techniques. PCA and FA reduce the number of variables to avoid the curse of dimensionality, in which computing time grows exponentially with the number of variables. For this reason, many methods have been published for dimension reduction. Dimension augmentation is another approach to analyzing data efficiently. The support vector machine (SVM) algorithm is a representative technique for dimension augmentation: the SVM maps the original data to a high-dimensional feature space to obtain the optimal decision plane. Both data reduction and augmentation have been used to solve diverse problems in data analysis. In this paper, we compare the strengths and weaknesses of dimension reduction and augmentation for classification and propose a classification method using data reduction. We carry out comparative experiments to verify the performance of the proposed method.

Correlation Analysis for Correlation Dimension of EEG and Cold-heat Score

  • Bas, No-Soo; Park, Young-Jae; Oh, Hwan-Sup; Park, Young-Bae
    • The Journal of the Society of Korean Medicine Diagnostics, v.11 no.2, pp.116-127, 2007
  • Background and Purpose: According to chaos theory, the irregular signals of an electroencephalogram (EEG) can be interpreted by nonlinear methods. Chaotic nonlinear dynamics in EEG can be studied by calculating the correlation dimension. The aim of this study is to analyze EEG by correlation dimension and to perform a correlation analysis between the correlation dimension and the cold-heat score. Method: Raw EEG data were recorded for 15 minutes, and 40 seconds were selected for analysis. We calculated the correlation dimension and used the surrogate data method to check the nonlinearity of the data. We then performed a correlation analysis. Result and Conclusion: The correlation dimensions of channel 7 and channel 8 showed a significant correlation with the cold score.

Correlation over Nonlinear Analysis of EEG and POMS Factor (Korean title: A Study of the Correlation between EEG and POMS (Profile of Mood States))

  • Kim, Dong-Won; Park, Young-Bae; Park, Young-Jae; Heo, Young
    • The Journal of the Society of Korean Medicine Diagnostics, v.11 no.2, pp.68-83, 2007
  • Background and Purpose: According to chaos theory, the irregular signals of an electroencephalogram (EEG) can be interpreted by nonlinear methods. Chaotic nonlinear dynamics in EEG can be studied by calculating the correlation dimension. The aim of this study is to analyze EEG by correlation dimension and to perform a correlation analysis between the correlation dimension and the K-POMS factor scores. Method: Raw EEG data were recorded for 15 minutes, and 40 seconds were selected for analysis. We calculated the correlation dimension and used the surrogate data method to check the nonlinearity of the data. We then performed a correlation analysis. Result and Conclusion: The correlation dimensions of channel 6, channel 7, and channel 8 showed a significant correlation with the vigor factor.
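
For readers unfamiliar with the correlation dimension used in the two EEG abstracts above, the following is a minimal sketch of one common estimator, the Grassberger-Procaccia correlation sum on a time-delay embedding. The abstracts do not say which estimator, embedding dimension, or lag the authors used, so the function names, the Chebyshev distance, and the two-radius slope are all assumptions made for illustration.

```python
import math


def delay_embed(signal, dim, lag):
    """Time-delay embedding: vectors (x[i], x[i+lag], ..., x[i+(dim-1)*lag])."""
    n = len(signal) - (dim - 1) * lag
    return [tuple(signal[i + k * lag] for k in range(dim)) for i in range(n)]


def correlation_sum(vectors, r):
    """Fraction of distinct vector pairs closer than r (Chebyshev distance)."""
    n = len(vectors)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(abs(a - b) for a, b in zip(vectors[i], vectors[j])) < r:
                close += 1
    return 2.0 * close / (n * (n - 1))


def correlation_dimension(signal, dim, lag, r_small, r_large):
    """Crude two-point slope of log C(r) versus log r.

    Both radii must be large enough that some pairs fall within them,
    otherwise the logarithm of a zero correlation sum is undefined.
    """
    vectors = delay_embed(signal, dim, lag)
    c_small = correlation_sum(vectors, r_small)
    c_large = correlation_sum(vectors, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )
```

In practice the slope is usually fitted over a range of radii and checked for saturation as the embedding dimension grows; the two-radius version here only conveys the idea that the correlation sum scales as a power of the radius.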

Roundness Modelling by Fractal Interpolation

  • Yoon, Moon-Chul; Kim, Byung-Tak; Chin, Do-Hun
    • Transactions of the Korean Society of Machine Tool Engineers, v.15 no.3, pp.67-72, 2006
  • There are many modelling methods that use theoretical and experimental data. Recently, fractal interpolation methods have been widely used to estimate and analyze various data. Because dynamic roundness profile data are chaotic in nature, their analysis calls for a method suited to time series data. The fractal analysis used in this paper covers fractal interpolation and the fractal dimension. Two methods for computing the fractal dimension are also introduced; they obtain the dimension of typical dynamic roundness profile data as a function of the number of data points, where the data sets generally contain fewer than 200 points. The fractal analysis results show that roundness profiles, which differ across round shape operations, can potentially be predicted.

Effects on Fractal Dimension by Automobile Driver's EEG during Highway Driving: Based on Chaos Theory

  • 이돈규; 김정룡
    • Journal of Korean Society of Industrial and Systems Engineering, v.23 no.57, pp.51-62, 2000
  • In this study, the psycho-physiological response of drivers was investigated in terms of EEG (electroencephalogram), in particular through the fractal dimensions computed by a chaotic algorithm. The chaotic algorithm is well known for its sensitivity in analyzing nonlinear information such as brain waves. An automobile fully equipped with a data acquisition system was used to collect the data. Ten healthy subjects participated in the experiment. EEG data were collected while the subjects drove the car between Won-ju and Shin-gal J.C. on the Young-Dong highway. The results were presented as a three-dimensional attractor to confirm the chaotic nature of the EEG data. The correlation dimension and fractal dimension were calculated to evaluate the complexity of brain activity as the driving duration changed. In particular, the fractal dimension indicated a difference between the driving and non-driving conditions, while other spectral variables showed inconsistent results. Based on the fractal dimension, drivers processed the most information at the beginning of highway driving, after which the amount of brain activity gradually decreased and stabilized. No particular decrease in brain activity was observed even after 100 km of driving. Considering the sensitivity and consistency of the analysis by the chaotic algorithm, the fractal dimension can be a useful parameter for evaluating the psycho-physiological responses of the human brain under various driving conditions.