• Title/Summary/Keyword: multivariate data


Fault Detection Method for Multivariate Process using Mahalanobis Distance and ICA (마할라노비스 거리와 독립성분분석을 이용한 다변량 공정 고장탐지 방법에 관한 연구)

  • Jung, Seunghwan;Kim, Sungshin
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.14 no.1 / pp.22-28 / 2021
  • In multivariate processes such as chemical processes, mechanical processes, and power plants, many facilities are interconnected in complex ways, so a fault in one system can have fatal consequences for the entire process. In addition, because process data are measured in an unstable environment, outliers are likely to be included in the data. Monitoring technology that can remove outliers from measured data and detect failures in advance is therefore essential. In this paper, data obtained from dynamic and multivariate process models were used to detect faults in various types of processes. The dynamic process is a simulation of a process with an autoregressive property, and the multivariate process is a model that describes a situation in which a specific sensor is faulty. Mahalanobis distance was used to remove outliers from the data generated by the dynamic and multivariate process models, and fault detection was then performed using independent component analysis (ICA). The proposed method was compared with a conventional single ICA method. In the dynamic process, the proposed fault detection method improves performance by 0.84%p for bias data and 6.82%p for drift data. In the multivariate process, performance improves by 3.78%p. The proposed method therefore showed better fault detection performance.
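
The outlier-removal step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the Mahalanobis-distance screening (not the ICA stage), it is restricted to two variables so that the covariance matrix can be inverted in closed form, and the function names, the sample data, and the cut-off of 5.99 (roughly the 95% chi-square quantile with 2 degrees of freedom) are all assumptions chosen for illustration.

```python
def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the sample covariance matrix (divisor n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^T S^{-1} d, using the closed-form inverse of a 2x2 matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def remove_outliers(points, threshold_sq):
    """Keep only points whose squared distance does not exceed the cut-off."""
    d2 = mahalanobis_sq_2d(points)
    return [p for p, d in zip(points, d2) if d <= threshold_sq]


# Hypothetical data: nine points near the line y = x plus one point far from it.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1),
          (6.0, 5.9), (7.0, 7.2), (8.0, 7.8), (9.0, 9.1), (5.0, 0.0)]
cleaned = remove_outliers(points, threshold_sq=5.99)  # illustrative cut-off
print(len(points) - len(cleaned), "point(s) removed")
```

The last point sits at the mean of the first variable but far from the trend shared by the others, which is the kind of observation a per-variable range check would miss and a covariance-aware distance can flag.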

Multivariate control charts for monitoring correlation coefficients in dispersion matrix

  • Chang, Duk-Joon;Heo, Sun-Yeong
    • Journal of the Korean Data and Information Science Society / v.23 no.5 / pp.1037-1044 / 2012
  • Multivariate control charts for effectively monitoring every component in the dispersion matrix of a multivariate normal process are considered. The numerical results show that multivariate control charts based on the sample statistic $V_i$ by Hotelling or $W_i$ by Alt do not work effectively when the correlation coefficient components of the dispersion matrix increase. We propose a combined procedure that monitors every component of the dispersion matrix by operating two control charts simultaneously: a chart controlling the variance components and a chart controlling the correlation coefficients. Our numerical results show that the proposed combined procedure is efficient for detecting changes in both the variances and the correlation coefficients of the dispersion matrix.

Multivariate Cumulative Sum Control Chart for Dispersion Matrix

  • Chang, Duk-Joon;Shin, Jae-Kyoung
    • Journal of the Korean Data and Information Science Society / v.13 no.2 / pp.21-29 / 2002
  • Several different control statistics for simultaneously monitoring the dispersion matrix of several quality variables are presented, since different control statistics can be used to describe variability. Multivariate cumulative sum (CUSUM) control charts are proposed, and their performance is evaluated in terms of average run length (ARL). Multivariate Shewhart charts are also proposed for comparison with the proposed CUSUM charts. The numerical results show that multivariate CUSUM charts are more efficient than multivariate Shewhart charts for small or moderate shifts. We also found that a small reference value of the CUSUM chart is more efficient for small shifts.
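
The role of the reference value can be seen in a simplified sketch. The example below is a one-sided, univariate tabular CUSUM, not the multivariate dispersion-matrix statistics proposed in the paper; the function name, the data, and the parameter values (`target`, reference value `k`, decision limit `h`) are assumptions for illustration only.

```python
def cusum_upper(xs, target, k, h):
    """One-sided upper CUSUM.

    Accumulates deviations above target + k and resets at zero.
    Returns the list of CUSUM statistics and the index of the first
    observation at which the statistic exceeds h (None if it never does).
    """
    c = 0.0
    stats = []
    signal = None
    for i, x in enumerate(xs):
        c = max(0.0, c + (x - target) - k)
        stats.append(c)
        if signal is None and c > h:
            signal = i
    return stats, signal


# Hypothetical series: in control for five observations, then shifted upward by about 1.
xs = [0.1, -0.2, 0.0, 0.3, -0.1, 1.0, 1.2, 0.9, 1.1, 1.0]
stats, signal = cusum_upper(xs, target=0.0, k=0.5, h=1.5)
print("first signal at index", signal)
```

Because each observation contributes only its excess over `target + k`, a smaller `k` lets smaller sustained shifts accumulate, which is consistent with the abstract's remark about small reference values.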


Relevance of Multivariate Analysis in Management Research

  • Ojha, Sateesh Kumar
    • Journal of Information Technology Applications and Management / v.23 no.3 / pp.25-34 / 2016
  • Research often yields misleading conclusions if variables are not properly analyzed. Across the functional areas of management, it is essential that all latent and observed variables are properly understood so that management decisions will be relevant and effective. The objective of this paper is to investigate the use of different multivariate tools for analysis in management research, whether applied or basic. The sources of data are primary as well as secondary. The primary sources include observation of research articles in the proceedings of different conferences, and the secondary sources include publications related to multivariate analysis. The study has revealed the reasons for not using such research tools. The preliminary finding is that most studies do not use such analytical tools in a comprehensive manner. Carelessness when fixing the design aspects is the main reason for not using an appropriate design.

A Development of Multivariate Analysis System by Using Excel (EXCEL을 이용한 다변량자료분석 시스템 개발)

  • 한상태;강현철;한정훈
    • The Korean Journal of Applied Statistics / v.17 no.1 / pp.165-172 / 2004
  • Recently, there have been several studies aimed at developing multivariate data analysis systems that can be readily used. The common characteristic of these studies is the development of a GUI system in which advanced statistical methods can be conveniently applied. As an extension of these studies, this study aims to provide users in various fields with an interactive system, built with Excel macros and VBA, that offers the convenience of a GUI environment for easily applying multivariate data analysis methods. The system provides a graphics-oriented, menu-driven user interface within Microsoft Excel, a widely used spreadsheet and analysis program.

A fast approximate fitting for mixture of multivariate skew t-distribution via EM algorithm

  • Kim, Seung-Gu
    • Communications for Statistical Applications and Methods / v.27 no.2 / pp.255-268 / 2020
  • A mixture of multivariate canonical fundamental skew t-distributions (CFUST) has been of interest in various fields, notably in the unsupervised learning community. However, fitting the model via the EM algorithm suffers from significant processing time. The main cause is the calculation of many multivariate t-cdfs (cumulative distribution functions) in the E-step. In this article, we provide an approximate but fast method for calculating the multivariate t-cdf in a univariate fashion, namely as the product of successively conditional univariate t-cdfs with Taylor's first-order approximation. By replacing all multivariate t-cdfs in the E-step with the proposed approximate versions, we obtain admissible fitting results, with an 85% reduction in processing time for the 5-dimensional skewness case of the Australian Institute of Sport data set. The rough properties, advantages, and limitations of this approach are also discussed.

Detecting Influential Observations in Multivariate Statistical Analysis of Incomplete Data by PCA (주성분분석에 의한 결손 자료의 영향값 검출에 대한 연구)

  • 김현정;문승호;신재경
    • The Korean Journal of Applied Statistics / v.13 no.2 / pp.383-392 / 2000
  • Since the late 1970s, methods of influence or sensitivity analysis for detecting influential observations have been studied not only in regression and related methods but also in various multivariate methods. If the results of multivariate analyses depend heavily on a small number of observations, we should be very careful in drawing conclusions. Similar phenomena may also occur with incomplete data. In this research, we study such influential observations in the multivariate statistical analysis of incomplete data. The case of principal component analysis is studied with a numerical example.


GEOSTATISTICAL INTEGRATION OF HIGH-RESOLUTION REMOTE SENSING DATA IN SPATIAL ESTIMATION OF GRAIN SIZE

  • Park, No-Wook;Chi, Kwang-Hoon;Jang, Dong-Ho
    • Proceedings of the KSRS Conference / v.1 / pp.406-408 / 2006
  • Various geological thematic maps, such as grain size or groundwater level maps, have been generated by interpolating sparsely sampled ground survey data. When data are sampled at only a limited number of locations, using secondary information that is correlated with the primary variable can help estimate the attribute values of the primary variable at unsampled locations. This paper applies two multivariate geostatistical algorithms to integrate remote sensing imagery with sparsely sampled ground survey data for spatial estimation of grain size: simple kriging with local means and kriging with an external drift. High-resolution IKONOS imagery, which is well correlated with the grain size, is used as secondary information. The algorithms are evaluated in a case study with grain size observations measured at 53 locations on Baramarae beach in Anmyeondo, Korea. Cross validation based on a leave-one-out approach is used to compare the estimation performance of the two multivariate geostatistical algorithms with that of traditional ordinary kriging.
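
The leave-one-out evaluation mentioned in this abstract can be sketched as follows. Note the substitution: the example uses plain inverse-distance weighting as a stand-in interpolator, not the kriging variants studied in the paper, and it uses no secondary imagery. The function names, the `power` parameter, and the three sample points are hypothetical.

```python
import math


def idw_predict(train, target_xy, power=2.0):
    """Inverse-distance-weighted estimate at target_xy from (x, y, value) samples."""
    num = 0.0
    den = 0.0
    for x, y, value in train:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0.0:
            return value  # exact hit on a sampled location
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def leave_one_out_rmse(samples, predict):
    """Hold out each sample in turn, predict it from the rest, and return the RMSE."""
    squared_errors = []
    for i, (x, y, value) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        squared_errors.append((predict(train, (x, y)) - value) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical samples: (x, y, measured value) at three locations on a line.
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 3.0)]
rmse = leave_one_out_rmse(samples, idw_predict)
print("leave-one-out RMSE:", rmse)
```

Passing the interpolator as an argument means the same loop could score any estimator with the same signature, which is how competing methods would be compared on equal terms.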


Multivariate Nonparametric Tests for Grouped and Right Censored Data

  • Park Hyo-Il;Na Jong-Hwa;Hong Seungman
    • International Journal of Reliability and Applications / v.6 no.1 / pp.53-64 / 2005
  • In this paper, we propose a nonparametric test procedure for multivariate, grouped, and right-censored data in the two-sample problem. To construct the test statistic, we use linear rank statistics for each component and apply the permutation principle to obtain the null distribution. For the large-sample case, the asymptotic distribution is derived under the null hypothesis with the additional assumption that the two censoring distributions are also equal. Finally, we illustrate our procedure with an example and offer some concluding remarks. In the appendices, we derive the expression for the covariance matrix and prove the asymptotic distribution.
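
The permutation principle named in this abstract can be illustrated in a much simpler setting. The sketch below handles a single component with complete, untied data and uses the Wilcoxon rank sum as the linear rank statistic; it does not handle grouping, censoring, or multiple components, so it is not the procedure proposed in the paper. The function name and the data are assumptions.

```python
from itertools import combinations


def rank_sum_permutation_test(x, y):
    """Exact two-sided permutation test based on the rank sum of the first sample.

    Assumes no ties. Returns (observed rank sum, p-value).
    """
    pooled = list(x) + list(y)
    total_n = len(pooled)
    n = len(x)
    # Rank the pooled observations from 1 (smallest) to total_n (largest).
    order = sorted(range(total_n), key=lambda i: pooled[i])
    ranks = [0] * total_n
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    observed = sum(ranks[:n])
    expected = n * (total_n + 1) / 2  # mean rank sum under the null hypothesis
    extreme = 0
    count = 0
    # Enumerate every way of relabelling n of the pooled observations as sample 1.
    for subset in combinations(range(total_n), n):
        s = sum(ranks[i] for i in subset)
        count += 1
        if abs(s - expected) >= abs(observed - expected):
            extreme += 1
    return observed, extreme / count


# Hypothetical samples with every value in x below every value in y.
x = [1.1, 2.3, 0.7, 1.9]
y = [3.4, 4.8, 2.9, 5.1]
observed, p_value = rank_sum_permutation_test(x, y)
print(observed, p_value)
```

Full enumeration is only practical for small samples; for larger ones a random subset of relabellings or an asymptotic distribution, as derived in the paper for its own statistic, would normally be used.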


A Resetting Scheme for Process Parameters using the Mahalanobis-Taguchi System

  • Park, Chang-Soon
    • The Korean Journal of Applied Statistics / v.25 no.4 / pp.589-603 / 2012
  • The Mahalanobis-Taguchi system (MTS) is a statistical tool for classifying normal and abnormal groups in multivariate data structures. In addition to the classification itself, the MTS includes a method for selecting variables useful for the classification. This method is especially efficient when the abnormal group data are scattered without a specific directionality. When a feedback adjustment procedure that controls process input variables through measurements of the process output is not practically possible, a reset procedure can be an alternative. This article proposes a reset procedure using the MTS. A method for identifying the input variables to reset by means of contributions is also proposed. Identifying the root-cause parameters using the existing dimension-reduced contribution tends to be difficult because of the variety of correlation relationships in multivariate data structures. However, an improved decision becomes possible when it is used together with the location-centered contribution and the individual-parameter contribution.