• Title/Summary/Keyword: sliced inverse regression (SIR)


Fused sliced inverse regression in survival analysis

  • Yoo, Jae Keun
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.5
    • /
    • pp.533-541
    • /
    • 2017
  • Sufficient dimension reduction (SDR) replaces the original p-dimensional predictors with a lower-dimensional, linearly transformed predictor. Sliced inverse regression (SIR) has the longest history and is the most popular of the SDR methodologies. The critical weakness of SIR is its known sensitivity to the number of slices. Recently, fused sliced inverse regression was developed to overcome this deficit; it combines SIR kernel matrices constructed from various choices of the number of slices. In this paper, fused sliced inverse regression and SIR are compared to show that the former has a practical advantage over the latter in survival regression. Numerical studies confirm this, and a real data example is presented.
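
The fusing idea in the abstract above can be sketched concretely: build one SIR kernel matrix per candidate number of slices, sum the kernels, and eigen-decompose the sum. A minimal Python sketch follows; the function names, the slice counts (4, 6, 8), and the simulated example are our own illustrative choices, not the paper's:

```python
import numpy as np

def sir_kernel(Z, y, n_slices):
    # SIR kernel: weighted outer products of within-slice means of the
    # standardized predictors Z, slicing on the sorted response y.
    n, p = Z.shape
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    return M

def fused_sir_directions(X, y, d=1, slice_counts=(4, 6, 8)):
    # Standardize X, sum the SIR kernels over several slice numbers,
    # and map the leading eigenvectors back to the original scale.
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    Z = Xc @ inv_sqrt
    M = sum(sir_kernel(Z, y, h) for h in slice_counts)
    _, evecs = np.linalg.eigh(M)           # eigenvalues in ascending order
    return inv_sqrt @ evecs[:, -d:]        # top-d directions, X-scale

# Illustrative single-index example: the true direction is beta.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
beta = np.array([1.0, -1.0, 0.0, 0.0, 0.0])
y = X @ beta + 0.2 * rng.normal(size=500)
b = fused_sir_directions(X, y)[:, 0]
cos = abs(b @ beta) / (np.linalg.norm(b) * np.linalg.norm(beta))
```

Summing the kernels retains any direction that at least one slicing scheme picks up, which is why the fused estimate is less sensitive to any single choice of the number of slices.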

On robustness in dimension determination in fused sliced inverse regression

  • Yoo, Jae Keun;Cho, Yoo Na
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.5
    • /
    • pp.513-521
    • /
    • 2018
  • The goal of sufficient dimension reduction (SDR) is to replace the original p-dimensional predictors with a lower-dimensional, linearly transformed predictor. Sliced inverse regression (SIR) (Li, Journal of the American Statistical Association, 86, 316-342, 1991) is one of the most popular SDR methods because of its applicability and simple implementation in practice. However, SIR may yield different dimension reduction results for different numbers of slices, and despite its popularity, this is a clear deficit of SIR. To overcome it, fused sliced inverse regression was recently proposed. That study shows that the dimension-reduced predictors are robust to the number of slices, but it does not investigate how robust the dimension determination is. This paper suggests a permutation dimension determination for fused sliced inverse regression, which is compared with SIR to investigate robustness to the number of slices in the dimension determination. Numerical studies confirm this, and a real data example is presented.
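
A permutation approach to dimension determination of the general kind discussed above can be sketched as follows. The kernel construction, the test statistic (sum of the smallest p − d kernel eigenvalues), and the simple permutation scheme here are our own simplified choices for illustration, not the papers' exact procedure:

```python
import numpy as np

def sir_eigvals(X, y, n_slices=5):
    # Eigenvalues (descending) of a SIR kernel on standardized predictors.
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    Z = Xc @ (vecs @ np.diag(vals ** -0.5) @ vecs.T)
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    return np.sort(np.linalg.eigvalsh(M))[::-1]

def permutation_pvalue(X, y, d, n_perm=200, seed=0):
    # Statistic: sum of the smallest p - d kernel eigenvalues. Large
    # observed values relative to the permuted ones suggest that the
    # structural dimension exceeds d.
    rng = np.random.default_rng(seed)
    stat = sir_eigvals(X, y)[d:].sum()
    perm = [sir_eigvals(X, rng.permutation(y))[d:].sum()
            for _ in range(n_perm)]
    return (1 + sum(s >= stat for s in perm)) / (1 + n_perm)

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))
y = X[:, 0] + 0.2 * rng.normal(size=400)   # true structural dimension: 1
p_d0 = permutation_pvalue(X, y, d=0)       # p-value for H0: d = 0
p_d1 = permutation_pvalue(X, y, d=1)       # p-value for H0: d = 1
```

The sequential use is standard: test d = 0, 1, 2, … and take the first d that is not rejected as the estimated dimension.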

Variable Selection in Sliced Inverse Regression Using Generalized Eigenvalue Problem with Penalties

  • Park, Chong-Sun
    • Communications for Statistical Applications and Methods
    • /
    • v.14 no.1
    • /
    • pp.215-227
    • /
    • 2007
  • A variable selection algorithm for sliced inverse regression (SIR) using penalty functions is proposed. We note that SIR models can be expressed as generalized eigenvalue decompositions and incorporate penalty functions into them. A small simulation suggests that the HARD penalty function is the best at preserving the original directions compared with other well-known penalty functions. It also turns out to be effective in forcing the coefficient estimates of irrelevant predictors to zero in regression analysis. Results from illustrative examples with simulated and real data sets are provided.

An Empirical Study on Dimension Reduction

  • Suh, Changhee;Lee, Hakbae
    • Journal of the Korean Data Analysis Society
    • /
    • v.20 no.6
    • /
    • pp.2733-2746
    • /
    • 2018
  • The two inverse regression estimation methods for the central subspace, SIR and SAVE, are computationally easy and widely used. However, SIR and SAVE may perform poorly in finite samples and need strong assumptions (linearity and/or constant covariance conditions) on the predictors. The two non-parametric estimation methods, MAVE and dMAVE, perform much better in finite samples than SIR and SAVE, and impose no strong requirements on the predictors or the response variable. MAVE estimates the central mean subspace, whereas dMAVE estimates the central subspace. This paper explores and compares these four dimension reduction methods, reviewing the algorithm of each. An empirical study on simulated data shows that MAVE and dMAVE perform better than SIR and SAVE across different models as well as different distributional assumptions on the predictors. However, a real data example with a binary response demonstrates that SAVE outperforms the other methods.
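
The SIR/SAVE contrast in the abstract comes down to which inverse moment each kernel uses: SIR uses within-slice means, while SAVE uses within-slice covariances and can therefore detect symmetric dependence that leaves the slice means at zero. A hedged sketch of both kernels (our own simplified implementation, assuming the predictors Z are already standardized):

```python
import numpy as np

def slice_indices(y, n_slices):
    # Split observation indices into slices along the sorted response.
    return np.array_split(np.argsort(y), n_slices)

def sir_kernel(Z, y, n_slices=5):
    # SIR: weighted outer products of the within-slice means E[Z | slice].
    n, p = Z.shape
    M = np.zeros((p, p))
    for idx in slice_indices(y, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    return M

def save_kernel(Z, y, n_slices=5):
    # SAVE: weighted squares of I - Cov(Z | slice); sensitive to
    # dependence that changes the spread but not the mean of a slice.
    n, p = Z.shape
    M = np.zeros((p, p))
    for idx in slice_indices(y, n_slices):
        A = np.eye(p) - np.cov(Z[idx], rowvar=False)
        M += (len(idx) / n) * (A @ A)
    return M

# y depends on Z[:, 0] only through its square, so slice means carry
# almost no signal: SIR's kernel stays near zero while SAVE's leading
# eigenvector aligns with the first coordinate.
rng = np.random.default_rng(2)
Z = rng.normal(size=(1000, 4))
y = Z[:, 0] ** 2 + 0.1 * rng.normal(size=1000)
sir_top = np.linalg.eigvalsh(sir_kernel(Z, y)).max()
save_vec = np.linalg.eigh(save_kernel(Z, y))[1][:, -1]
```

This symmetric example is exactly the failure mode of SIR that motivates keeping SAVE in the comparison, despite SIR's better-known finite-sample behavior for monotone trends.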

Generalization of Fisher′s linear discriminant analysis via the approach of sliced inverse regression

  • Chen, Chun-Houh;Li, Ker-Chau
    • Journal of the Korean Statistical Society
    • /
    • v.30 no.2
    • /
    • pp.193-217
    • /
    • 2001
  • Despite the rich literature in discriminant analysis, much of this complicated subject remains to be explored. In this article, we study the theoretical foundation that supports Fisher's linear discriminant analysis (LDA) by setting up the classification problem under the dimension reduction framework of Li (1991), introduced for sliced inverse regression (SIR). Through the connection between SIR and LDA, our theory helps identify sources of strength and weakness in using CRIMCOORDS (Gnanadesikan, 1977) as a graphical tool for displaying group separation patterns. This connection also leads to several ways of generalizing LDA for better exploration and exploitation of nonlinear data patterns.

Nonparametric test on dimensionality of explanatory variables (설명변수 차원 축소에 관한 비모수적 검정)

  • 서한손
    • The Korean Journal of Applied Statistics
    • /
    • v.8 no.2
    • /
    • pp.65-75
    • /
    • 1995
  • For determining the dimension of the e.d.r. space, both sliced inverse regression (SIR) and principal Hessian directions (PHD) come with asymptotic tests. However, the asymptotic tests require normality and large samples of the explanatory variables. Cook and Weisberg (1991) suggested permutation tests instead. In this study, permutation tests are carried out, and their power is compared with that of the asymptotic tests for SIR and PHD.

Asymptotic Test for Dimensionality in Sliced Inverse Regression (분할 역회귀모형에서 차원결정을 위한 점근검정법)

  • Park, Chang-Sun;Kwak, Jae-Guen
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.2
    • /
    • pp.381-393
    • /
    • 2005
  • Sliced inverse regression (SIR), a promising technique for dimension reduction in regression analysis, and an associated chi-square test for dimensionality were introduced by Li (1991). However, Li's test requires normality of the predictors and is found to depend heavily on the number of slices. We provide a unified asymptotic test for determining the dimensionality of the SIR model that is based on probabilistic principal component analysis and free of the normality assumption on the predictors. Illustrative results with simulated and real examples are also provided.
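
For context, the Li (1991) chi-square test that this paper improves on has a simple closed form: with p predictors, H slices, and SIR kernel eigenvalues in decreasing order, the statistic for H0: d = m is n times the sum of the p − m smallest eigenvalues, asymptotically chi-square with (p − m)(H − m − 1) degrees of freedom under normal predictors. A sketch (the kernel construction and the simulated example are our own):

```python
import numpy as np

def li_sir_test(X, y, m, n_slices=5):
    # Li's (1991) chi-square test of H0: structural dimension = m.
    # Returns the statistic and its asymptotic degrees of freedom.
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    Z = Xc @ (vecs @ np.diag(vals ** -0.5) @ vecs.T)
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        mu = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(mu, mu)
    lam = np.sort(np.linalg.eigvalsh(M))[::-1]
    stat = n * lam[m:].sum()                  # n * (p - m smallest eigenvalues)
    df = (p - m) * (n_slices - m - 1)
    return stat, df

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 4))
y = X @ np.array([1.0, 1.0, 0.0, 0.0]) + 0.3 * rng.normal(size=500)
stat0, df0 = li_sir_test(X, y, m=0)   # H0: d = 0 (false here, d = 1)
stat1, df1 = li_sir_test(X, y, m=1)   # H0: d = 1 (true here)
# The 5% critical values of chi-square with 16 and 9 degrees of
# freedom are about 26.30 and 16.92, respectively.
```

The dependence of this test on the number of slices enters through both the kernel and the degrees of freedom, which is exactly the sensitivity the abstract's unified test is designed to remove.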

Dimension reduction for right-censored survival regression: transformation approach

  • Yoo, Jae Keun;Kim, Sung-Jin;Seo, Bi-Seul;Shin, Hyejung;Sim, Su-Ah
    • Communications for Statistical Applications and Methods
    • /
    • v.23 no.3
    • /
    • pp.259-268
    • /
    • 2016
  • High-dimensional survival data with large numbers of predictors have become more common. The analysis of such data can be facilitated if the dimension of the predictors is adequately reduced. Recent studies show that a method called sliced inverse regression (SIR) is an effective dimension reduction tool in high-dimensional survival regression. However, its implementation is hampered by a double-categorization procedure. In the right-censoring case, this problem can be overcome by transforming the observed survival time and censoring status into a single variable. This provides more flexibility in the categorization, so the applicability of SIR can be enhanced. Numerical studies show that the proposed transformation approach is as good as (or even better than) the usual SIR application under both balanced and highly unbalanced censoring. The real data example also confirms its practical usefulness, so the proposed approach should be an effective and valuable addition for statistical practitioners.

A Short Note on Empirical Penalty Term Study of BIC in K-means Clustering Inverse Regression

  • Ahn, Ji-Hyun;Yoo, Jae-Keun
    • Communications for Statistical Applications and Methods
    • /
    • v.18 no.3
    • /
    • pp.267-275
    • /
    • 2011
  • According to recent studies, the Bayesian information criterion (BIC) has been proposed to determine the structural dimension of the central subspace through sliced inverse regression (SIR) with high-dimensional predictors. The BIC may also be useful in K-means clustering inverse regression (KIR) with high-dimensional predictors. However, the direct application of the BIC to KIR may be problematic, because the slicing scheme in SIR is not the same as that of KIR. In this paper, we present empirical studies of the BIC penalty term in KIR to identify the most appropriate one. Numerical studies and a real data analysis are presented.
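
A BIC-type criterion of the kind studied above typically picks the dimension d that maximizes the sum of the d largest kernel eigenvalues minus a penalty growing in d; the penalty term is exactly what the paper compares empirically. A sketch with one illustrative penalty (the form d·log(n)/√n is our own placeholder, not a term from the paper):

```python
import numpy as np

def bic_dimension(eigvals, n, penalty=None):
    # Choose d maximizing (sum of the d largest eigenvalues) - penalty(d).
    # The default penalty, d * log(n) / sqrt(n), is only a placeholder
    # for the empirical penalty terms the paper compares.
    if penalty is None:
        penalty = lambda d: d * np.log(n) / np.sqrt(n)
    lam = np.sort(np.asarray(eigvals))[::-1]
    scores = [lam[:d].sum() - penalty(d) for d in range(len(lam) + 1)]
    return int(np.argmax(scores))

# Two dominant eigenvalues and two near-zero ones: the criterion
# should stop at d = 2 for moderate sample sizes.
d_hat = bic_dimension([0.8, 0.5, 0.02, 0.01], n=500)
```

The attraction of such criteria over sequential testing is that one maximization replaces a chain of hypothesis tests, but the result is only as good as the penalty calibration, hence the empirical study.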

DR-LSTM: Dimension reduction based deep learning approach to predict stock price

  • Ah-ram Lee;Jae Youn Ahn;Ji Eun Choi;Kyongwon Kim
    • Communications for Statistical Applications and Methods
    • /
    • v.31 no.2
    • /
    • pp.213-234
    • /
    • 2024
  • In recent decades, increasing research attention has been directed toward predicting stock prices in financial markets using deep learning methods. For instance, the recurrent neural network (RNN) is known to be competitive for time-series data. Long short-term memory (LSTM) further improves the RNN by providing an alternative approach to the vanishing gradient problem, and it gains predictive accuracy by retaining memory for a longer time. In this paper, we combine both supervised and unsupervised dimension reduction methods with LSTM to enhance forecasting performance, and we refer to this as the dimension reduction based LSTM (DR-LSTM) approach. For supervised dimension reduction, we use sliced inverse regression (SIR), sparse SIR, and kernel SIR; as unsupervised methods, we use principal component analysis (PCA), sparse PCA, and kernel PCA. Using real stock market index datasets (S&P 500, STOXX Europe 600, and KOSPI), we present a comparative study of predictive accuracy between the six DR-LSTM methods and time series modeling.