• Title/Summary/Keyword: Multivariate Data

DD-plot for Detecting the Out-of-Control State in Multivariate Process (다변량공정에서 이상상태를 탐지하기 위한 DD-plot)

  • Jang, Dae-Heung;Yi, Seongbaek;Kim, Youngil
    • The Korean Journal of Applied Statistics
    • /
    • v.26 no.2
    • /
    • pp.281-290
    • /
    • 2013
  • It is well known that the DD-plot is a useful graphical tool for non-parametric classification. In this paper, we propose another use of the DD-plot: detecting the out-of-control state in a multivariate process. For this purpose, we suggest a dynamic version of the DD-plot and an accompanying quality index plot.

Nonlinear structural modeling using multivariate adaptive regression splines

  • Zhang, Wengang;Goh, A.T.C.
    • Computers and Concrete
    • /
    • v.16 no.4
    • /
    • pp.569-585
    • /
    • 2015
  • Various computational tools are available for modeling highly nonlinear structural engineering problems that lack a precise analytical theory or understanding of the phenomena involved. This paper adopts a fairly simple nonparametric adaptive regression algorithm known as multivariate adaptive regression splines (MARS) to model the nonlinear interactions between variables. The MARS method makes no specific assumptions about the underlying functional relationship between the input variables and the response. Details of the MARS methodology and its associated procedures are introduced first, followed by a number of examples including three practical structural engineering problems. These examples indicate the accuracy of the MARS prediction approach. Additionally, MARS is able to assess the relative importance of the design variables. Because MARS explicitly defines the intervals for the input variables, the model gives engineers insight into where significant changes in the data may occur. An example is also presented to demonstrate how the MARS-developed model can be used to carry out structural reliability analysis.
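
The remark that MARS explicitly defines intervals for the input variables can be illustrated with hinge basis functions, the piecewise-linear building blocks MARS models are assembled from. The following is a minimal sketch only: the knot location and the coefficients are hypothetical values chosen for illustration, not results from the paper.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot) for direction=+1,
    max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_like_model(x):
    # Hypothetical intercept, coefficients, and knot (assumed for illustration).
    # The knot at x = 3.0 splits the input range into two intervals,
    # each with its own slope.
    return 2.0 + 1.5 * hinge(x, 3.0, +1) - 0.5 * hinge(x, 3.0, -1)


# Evaluate on both sides of the knot and at the knot itself.
values = [mars_like_model(x) for x in (1.0, 3.0, 5.0)]
print(values)
```

The slope changes at the knot, which is how a fitted model of this kind shows where the response behavior shifts.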

A rolling analysis on the prediction of value at risk with multivariate GARCH and copula

  • Bai, Yang;Dang, Yibo;Park, Cheolwoo;Lee, Taewook
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.6
    • /
    • pp.605-618
    • /
    • 2018
  • Risk management has been a crucial part of the daily operations of the financial industry over the past two decades. Value at Risk (VaR), a quantitative measure introduced by JP Morgan in 1995, is the most popular and simplest quantitative measure of risk. VaR has been widely applied to risk evaluation across all types of financial activities, including portfolio management and asset allocation. This paper uses implementations of multivariate GARCH models and copula methods to illustrate the performance of a one-day-ahead VaR prediction modeling process for high-dimensional portfolios. Many factors, such as the interaction among the included assets, are taken into account in the modeling process. Additionally, empirical data analyses and backtesting results are demonstrated through a rolling analysis, which helps capture the instability of parameter estimates. We find that our way of modeling is relatively robust and flexible.
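
The idea of a rolling one-day-ahead VaR backtest can be sketched as follows. This sketch deliberately swaps in plain historical simulation (an empirical quantile of past returns) for the multivariate GARCH and copula models used in the paper. The function names, the window length, and the synthetic returns are all assumptions made for illustration.

```python
import random


def historical_var(returns, alpha=0.05):
    """Empirical VaR: the loss level that roughly a fraction alpha
    of the observed returns fall below (reported as a positive loss)."""
    ordered = sorted(returns)
    index = max(0, int(alpha * len(ordered)) - 1)
    return -ordered[index]


def rolling_backtest(returns, window, alpha=0.05):
    """Re-estimate VaR on each trailing window and count the days
    on which the next return breaches the forecast."""
    violations = 0
    forecasts = 0
    for t in range(window, len(returns)):
        var_t = historical_var(returns[t - window:t], alpha)
        forecasts += 1
        if returns[t] < -var_t:
            violations += 1
    return violations, forecasts


# Synthetic daily returns (hypothetical data, not from the paper).
random.seed(0)
synthetic = [random.gauss(0.0, 0.01) for _ in range(500)]
violations, forecasts = rolling_backtest(synthetic, window=100)
print(violations, forecasts)
```

In a backtest of this kind, the observed violation rate is compared with the nominal level alpha. Re-estimating on every window is what lets a rolling analysis expose unstable parameter estimates.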

Quantile confidence region using highest density

  • Hong, Chong Sun;Yoo, Myung Soo
    • Communications for Statistical Applications and Methods
    • /
    • v.26 no.1
    • /
    • pp.35-46
    • /
    • 2019
  • The Multivariate Confidence Region (MCR) cannot be used to obtain the confidence region of the mean vector of multivariate data when the normality assumption is not satisfied; however, the Quantile Confidence Region (QCR) can be used with a Multivariate Quantile Vector in these cases. The coverage rate of the QCR is better than that of the MCR, but the QCR has the disadvantage of a wide shape when the probability density function is bimodal. In this study, we propose a Quantile Confidence Region using the Highest Density (QCRHD) method, based on the Highest Density Region (HDR). The coverage rate of the QCRHD is found to be superior to that of the MCR and similar to that of the QCR. When the distance between the mean vectors is small, the QCRHD forms a single region similar to the QCR. When the distance is large, the QCR forms one wide region, whereas the QCRHD forms two smaller regions. Based on these features, the QCRHD can overcome the main disadvantage of the QCR, namely its potentially wide shape.

Matrix Formation in Univariate and Multivariate General Linear Models

  • Arwa A. Alkhalaf
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.4
    • /
    • pp.44-50
    • /
    • 2024
  • This paper offers an overview of matrix formation and calculation techniques within the framework of General Linear Models (GLMs). It takes a sequential approach, beginning with a detailed exploration of matrix formation and calculation methods in regression analysis and univariate analysis of variance (ANOVA). Subsequently, it extends the discussion to cover multivariate analysis of variance (MANOVA). The primary objective of this study is to provide a clear and accessible explanation of the underlying matrices that play a crucial role in GLMs. It does so by linking essentially different statistical methods through the fundamental principles and algebraic foundations that underpin GLM estimation. The insights presented here aim to assist researchers, statisticians, and data analysts in enhancing their understanding of GLMs and their practical implementation in diverse research domains. This paper contributes to a better comprehension of the matrix-based techniques that can be extended to GLMs.

An Alternating Approach of Maximum Likelihood Estimation for Mixture of Multivariate Skew t-Distribution (치우친 다변량 t-분포 혼합모형에 대한 최우추정)

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics
    • /
    • v.27 no.5
    • /
    • pp.819-831
    • /
    • 2014
  • The Exact-EM algorithm can conventionally fit a mixture of multivariate skew t-distributions. However, it incurs a very high computational cost in calculating the moments of the multivariate truncated t-distribution in the E-step. This paper proposes a new SPU-EM method that adopts the AECM algorithm principle proposed by Meng and van Dyk (1997) to circumvent the multi-dimensionality of the moments. This method offers a shorter execution time than the conventional Exact-EM algorithm. Some experiments are provided to show its effectiveness.

On the Application of Multivariate Kendall's Tau and Its Interpretation (다차원 캔달의 타우의 통계학적 응용과 그의 해석)

  • Lee, Woojoo;Ahn, Jae Youn
    • The Korean Journal of Applied Statistics
    • /
    • v.26 no.3
    • /
    • pp.495-509
    • /
    • 2013
  • We study a multivariate extension of Kendall's tau and its statistical interpretation. There exist various versions of multivariate Kendall's tau, for example Scarsini (1984), Joe (1990) and Genest et al. (2011); however, few of them mention its lower bounds. For the bivariate case, the Fréchet-Hoeffding lower bound attains the lower bound of Kendall's tau. However, in the multivariate case, the Fréchet-Hoeffding lower bound itself does not exist as a distribution, which makes the interpretation of Kendall's tau unclear when it takes a negative value. In this paper, we explain sufficient conditions for attaining the lower bound of Kendall's tau and provide real data examples that offer further insight into the interpretation of these lower bounds.
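
The contrast between the bivariate and multivariate cases can be seen in a small numerical sketch. It uses the ordinary pairwise (bivariate) Kendall's tau, not any of the multivariate versions cited above, and the data are hypothetical. Two variables can be perfectly discordant (tau = -1), but once x and y are, any third variable that disagrees with x must agree with y to the same degree.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau for samples without ties:
    (concordant pairs - discordant pairs) / total pairs."""
    score = 0
    pairs = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += 1 if product > 0 else -1
        pairs += 1
    return score / pairs


x = [1, 2, 3, 4, 5]
y = [-v for v in x]   # countermonotonic with x, so tau(x, y) = -1
z = [1, 3, 2, 5, 4]   # arbitrary third variable (hypothetical data)

tau_xy = kendall_tau(x, y)
tau_xz = kendall_tau(x, z)
tau_yz = kendall_tau(y, z)
print(tau_xy, tau_xz, tau_yz)
```

Because y reverses the ordering of x exactly, tau(y, z) is always the negative of tau(x, z). The three pairwise values therefore cannot all equal -1, which is the intuition behind the missing lower-bound distribution in more than two dimensions.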

Multivariate Statistical Analysis and Prediction for the Flash Points of Binary Systems Using Physical Properties of Pure Substances (순수 성분의 물성 자료를 이용한 2성분계 혼합물의 인화점에 대한 다변량 통계 분석 및 예측)

  • Lee, Bom-Sock;Kim, Sung-Young
    • Journal of the Korean Institute of Gas
    • /
    • v.11 no.3
    • /
    • pp.13-18
    • /
    • 2007
  • Multivariate statistical analysis using multiple linear regression (MLR) has been applied to analyze and predict the flash points of binary systems. Predicting the flash points of flammable substances is important for examining fire and explosion hazards in chemical process design. In this paper, the flash points are predicted by MLR based on the physical properties of pure substances and experimental flash point data. The regression and prediction results from MLR are compared with the values calculated by Raoult's law and the Van Laar equation.

Multivariate process control procedure using a decision tree learning technique (의사결정나무를 이용한 다변량 공정관리 절차)

  • Jung, Kwang Young;Lee, Jaeheon
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.3
    • /
    • pp.639-652
    • /
    • 2015
  • In today's manufacturing environment, process data can be easily measured and transferred to a computer for real-time analysis. As a result, it is possible to monitor several correlated quality variables simultaneously. Various multivariate statistical process control (MSPC) procedures have been presented to detect an out-of-control event. Although the classical MSPC procedures give an out-of-control signal, it is difficult to determine which variable has caused the signal. To solve this problem, data mining and machine learning techniques can be considered. In this paper, we apply decision tree learning to MSPC and run simulations of MSPC procedures that monitor the means of a bivariate normal process. The simulation results show that the overall performance of the MSPC procedure using decision tree learning is similar across several values of the correlation coefficient, while the correct classification rates for out-of-control signals differ depending on the correlation coefficient and the shift magnitude. The introduced procedure has the advantage of providing information about assignable causes, which practitioners may require.

Comparison study of modeling covariance matrix for multivariate longitudinal data (다변량 경시적 자료 분석을 위한 공분산 행렬의 모형화 비교 연구)

  • Kwak, Na Young;Lee, Keunbaik
    • The Korean Journal of Applied Statistics
    • /
    • v.33 no.3
    • /
    • pp.281-296
    • /
    • 2020
  • Repeated outcomes from the same subjects are referred to as longitudinal data. Unlike cross-sectional data, such data require analysis methods that account for the repeated measurements. It is important to model the covariance matrix because the correlation between the repeated outcomes must be considered when estimating the effects of covariates on the mean response. However, modeling the covariance matrix is difficult because there are many parameters to be estimated, and the estimated covariance matrix must be positive definite. In this paper, we analyze multivariate longitudinal data using two modeling methodologies for the covariance matrix. Both methods describe serial correlations of multivariate longitudinal outcomes using a modified Cholesky decomposition. However, the two methods use different decompositions to explain the correlation between simultaneous responses. The first method uses enhanced linear covariance models so that the covariance matrix satisfies the positive definiteness condition; in addition, principal component analysis and the maximization-minimization (MM) algorithm are used to estimate the model parameters. The second method uses the variance-correlation decomposition and the hypersphere decomposition to model the covariance matrix. Simulations are used to compare the performance of the two methodologies.
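
To show why a Cholesky-type decomposition helps with the positive definiteness constraint, here is a minimal sketch of the factorization Sigma = L D L', with L unit lower triangular and D diagonal. This is the generic textbook form, assumed here for illustration, and not necessarily the exact parameterization used in the paper. The example matrix is hypothetical.

```python
def ldl_decompose(sigma):
    """Factor a symmetric positive definite matrix as sigma = L D L',
    where L is unit lower triangular and D is diagonal (returned as a list)."""
    n = len(sigma)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        # Diagonal entry: variance left over after removing earlier components.
        d[j] = sigma[j][j] - sum(L[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            partial = sum(L[i][k] * L[j][k] * d[k] for k in range(j))
            L[i][j] = (sigma[i][j] - partial) / d[j]
    return L, d


def rebuild(L, d):
    """Reassemble L D L' to check the factorization."""
    n = len(d)
    return [[sum(L[i][k] * d[k] * L[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]


# Hypothetical 3 x 3 covariance matrix for three repeated measurements.
sigma = [[4.0, 2.0, 0.0],
         [2.0, 5.0, 3.0],
         [0.0, 3.0, 6.0]]
L, d = ldl_decompose(sigma)
print(d)
```

The appeal of this form is that the below-diagonal entries of L are unconstrained, and any positive values in D yield a positive definite matrix. The constraint therefore reduces to keeping the diagonal entries positive, for example by modeling their logarithms.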