• Title/Summary/Keyword: multi-regression statistics


Fused inverse regression with multi-dimensional responses

  • Cho, Youyoung;Han, Hyoseon;Yoo, Jae Keun
    • Communications for Statistical Applications and Methods / v.28 no.3 / pp.267-279 / 2021
  • Regression with multi-dimensional responses is quite common in the so-called big data era. In such regression, dimension reduction of the predictors is essential to relieve the curse of dimensionality caused by high-dimensional responses. Sufficient dimension reduction provides effective tools for this reduction, but few sufficient dimension reduction methodologies exist for multivariate regression. To fill this gap, we propose two fused slice-based inverse regression methods. The proposed approaches are robust to the number of clusters or slices and improve on the estimation results of existing methods by fusing many kernel matrices. Numerical studies are presented and compared with existing methods. A real data analysis confirms the practical usefulness of the proposed methods.
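
As background for the abstract above, the following is a minimal sketch of classical sliced inverse regression for a single univariate response. It is not the fused, multi-response method the paper proposes; the function name `sliced_inverse_regression`, its parameters, and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def sliced_inverse_regression(X, y, n_slices=5, n_directions=1):
    """Classical SIR for a univariate response (illustrative sketch).

    Assumes the sample covariance of X is nonsingular.
    Returns a (p, n_directions) array of unit-length direction estimates.
    """
    n, p = X.shape
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Inverse square root of the covariance via its eigendecomposition.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mean) @ inv_sqrt  # standardized predictors
    # Slice the observations by the sorted response values.
    order = np.argsort(y)
    kernel = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        slice_mean = Z[idx].mean(axis=0)
        # Weighted outer product of each slice mean builds the kernel matrix.
        kernel += (len(idx) / n) * np.outer(slice_mean, slice_mean)
    # Leading eigenvectors of the kernel span the estimated subspace.
    _, vecs = np.linalg.eigh(kernel)
    top = vecs[:, ::-1][:, :n_directions]
    directions = inv_sqrt @ top  # map back to the original predictor scale
    return directions / np.linalg.norm(directions, axis=0)

# Hypothetical simulated data: y depends on X only through one direction b.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
b = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
y = np.exp(X @ b) + 0.1 * rng.normal(size=2000)
est = sliced_inverse_regression(X, y, n_slices=10)
alignment = abs(float(est[:, 0] @ b))  # close to 1 when the direction is recovered
```

The estimate depends on the chosen `n_slices`, which is the sensitivity the abstract says the fused approaches are designed to reduce.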

Firework plot for evaluating the impact of influential observations in multi-response surface methodology (다반응 반응표면분석에서 특이값의 영향을 평가하기 위한 불꽃그림)

  • Kim, Sang Ik;Jang, Dae-Heung
    • The Korean Journal of Applied Statistics / v.31 no.1 / pp.97-108 / 2018
  • It has been routine practice in regression analysis to check the validity of the assumed model using regression diagnostic tools. Outliers and influential observations often distort the regression output in an undesired manner. Jang and Anderson-Cook (Quality and Reliability Engineering International, 30, 1409-1425, 2014) proposed a graphical method (called a firework plot) for exploratory visualization of the trace of the impact of possible outliers and influential observations on individual regression coefficients and on the overall residual sum of squares. This paper extends this graphical approach to a multi-response surface methodology problem.

MP-Lasso chart: a multi-level polar chart for visualizing group Lasso analysis of genomic data

  • Min Song;Minhyuk Lee;Taesung Park;Mira Park
    • Genomics & Informatics / v.20 no.4 / pp.48.1-48.7 / 2022
  • Penalized regression has been widely used in genome-wide association studies for joint analyses to find genetic associations. Among penalized regression models, the least absolute shrinkage and selection operator (Lasso) method effectively removes some coefficients from the model by shrinking them to zero. To handle group structures, such as genes and pathways, several modified Lasso penalties have been proposed, including group Lasso and sparse group Lasso. Group Lasso ensures sparsity at the level of pre-defined groups, eliminating unimportant groups. Sparse group Lasso performs group selection as in group Lasso, but also performs individual selection as in Lasso. While these sparse methods are useful in high-dimensional genetic studies, interpreting the results with many groups and coefficients is not straightforward. Lasso's results are often expressed as trace plots of regression coefficients. However, few studies have explored the systematic visualization of group information. In this study, we propose a multi-level polar Lasso (MP-Lasso) chart, which can effectively represent the results from group Lasso and sparse group Lasso analyses. An R package to draw MP-Lasso charts was developed. Through a real-world genetic data application, we demonstrated that our MP-Lasso chart package effectively visualizes the results of Lasso, group Lasso, and sparse group Lasso.
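
To make the difference between the three penalties described above concrete, here is a small sketch of their proximal (shrinkage) operators. This is a generic illustration under assumed names (`soft_threshold`, `group_soft_threshold`, `sparse_group_soft_threshold`) and made-up coefficients; it is not the MP-Lasso R package or the authors' code.

```python
import numpy as np

def soft_threshold(beta, lam):
    """Lasso shrinkage: move each coefficient toward zero by lam."""
    return np.sign(beta) * np.maximum(np.abs(beta) - lam, 0.0)

def group_soft_threshold(beta, groups, lam):
    """Group Lasso shrinkage: shrink each group's Euclidean norm by lam.

    A group whose norm is at most lam is removed as a whole.
    """
    out = np.zeros_like(beta, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(beta[idx])
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * beta[idx]
    return out

def sparse_group_soft_threshold(beta, groups, lam_l1, lam_group):
    """Sparse group Lasso shrinkage: individual step, then group step."""
    return group_soft_threshold(soft_threshold(beta, lam_l1), groups, lam_group)

# Hypothetical coefficients in three pre-defined groups of two.
beta = np.array([3.0, 4.0, 0.3, -0.4, 2.0, -0.1])
groups = [[0, 1], [2, 3], [4, 5]]

lasso = soft_threshold(beta, 0.5)
group = group_soft_threshold(beta, groups, 1.0)
sparse_group = sparse_group_soft_threshold(beta, groups, 0.5, 1.0)
```

In this toy case the group step drops the weak middle group entirely but keeps the small coefficient in the last group, while the sparse group step also zeroes that small coefficient.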

Multi-variate Fuzzy Polynomial Regression using Shape Preserving Operations

  • Hong, Dug-Hun;Do, Hae-Young
    • Journal of the Korean Data and Information Science Society / v.14 no.1 / pp.131-141 / 2003
  • In this paper, we prove that multi-variate fuzzy polynomials are universal approximators for multi-variate fuzzy functions which are obtained by applying the extension principle to continuous real-valued functions under $T_W$-based fuzzy arithmetic operations, for a distance measure that Buckley et al. (1999) used. We also consider a class of fuzzy polynomial regression models. A mixed non-linear programming approach is used to derive the satisfying solution.


Variable Selection with Regression Trees

  • Chang, Young-Jae
    • The Korean Journal of Applied Statistics / v.23 no.2 / pp.357-366 / 2010
  • Many tree algorithms have been developed for regression problems. Although they are regarded as good algorithms, most of them suffer from a loss of prediction accuracy when there are many noise variables. To handle this problem, we propose the multi-step GUIDE, which is a regression tree algorithm with a variable selection process. The multi-step GUIDE performs better than some well-known algorithms such as Random Forest and MARS. The results of a simulation study show that the multi-step GUIDE outperforms other algorithms in terms of variable selection and prediction accuracy. It generally selects the important variables correctly, with relatively few noise variables, and consequently gives good prediction accuracy.

Applications of response dimension reduction in large p-small n problems

  • Minjee Kim;Jae Keun Yoo
    • Communications for Statistical Applications and Methods / v.31 no.2 / pp.191-202 / 2024
  • The goal of this paper is to show how multivariate regression analysis with high-dimensional responses is facilitated by response dimension reduction. Multivariate regression, characterized by multi-dimensional response variables, is increasingly prevalent across diverse fields such as repeated measures, longitudinal studies, and functional data analysis. One of the key challenges in analyzing such data is managing the response dimension, which can complicate the analysis due to an exponential increase in the number of parameters. Although response dimension reduction methods have been developed, there is no practically useful illustration for various types of data such as so-called large p-small n data. This paper aims to fill this gap by showcasing how response dimension reduction can enhance the analysis of high-dimensional response data, thereby providing significant assistance to statistical practitioners and contributing to advancements in multiple scientific domains.

Scaling MDS for Preference Data Using Target Configuration

  • Hwang, S.Y.;Park, S.K.
    • Journal of the Korean Data and Information Science Society / v.14 no.2 / pp.237-245 / 2003
  • MDS (multi-dimensional scaling) for preference data is a graphical tool that typically shows how consumers perceive and evaluate certain products. This article is mainly concerned with optimal scaling for MDS when a target configuration is available. Rotation of axes and SUR (seemingly unrelated regression) methods are employed to obtain a new configuration that is as close to the target as possible. The methodologies developed here are also illustrated with a real data set.


Regression models for interval-censored semi-competing risks data with missing intermediate transition status (중간 사건이 결측되었거나 구간 중도절단된 준 경쟁 위험 자료에 대한 회귀모형)

  • Kim, Jinheum;Kim, Jayoun
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1311-1327 / 2016
  • We propose a multi-state model for analyzing semi-competing risks data with interval-censored or missing intermediate events. This model is an extension of the 'illness-death model', which comprises three states: 'healthy', 'diseased', and 'dead'. The 'diseased' state can be considered an intermediate event. Two more states are added to the illness-death model to describe missing events caused by loss to follow-up before the end of the study. One of them is an 'LTF' state, representing loss to follow-up, and the other is an unobservable state that represents the intermediate event experienced after LTF occurred. Given covariates, we employ the Cox proportional hazards model with a normal frailty and construct a full likelihood to estimate transition intensities between states in the multi-state model. Marginalization of the full likelihood is carried out using adaptive Gaussian quadrature, and the optimal solution for the regression parameters is obtained through the iterative Newton-Raphson algorithm. Simulation studies are carried out to investigate the finite-sample performance of the proposed estimation procedure in terms of the empirical coverage probability of the true regression parameter. Our proposed method is also illustrated with a dataset adapted from Helmer et al. (2001).

Impact of Regional Emergency Medical Access on Patients' Prognosis and Emergency Medical Expenditure (지역별 응급의료 접근성이 환자의 예후 및 응급의료비 지출에 미치는 영향)

  • Kim, Yeonjin;Lee, Tae-Jin
    • Health Policy and Management / v.30 no.3 / pp.399-408 / 2020
  • Background: The purpose of this study was to examine the impact of regional characteristics on the accessibility of emergency care, and the impact of emergency medical accessibility on patients' prognosis and emergency medical expenditure. Methods: This study used the 13th beta version 1.6 annual data of the Korea Health Panel and statistics from the Korean Statistical Information Service. The sample included 8,119 patients who visited emergency centers between 2013 and 2017. The arrival time, which indicated medical access, was used as the dependent variable for multi-level analysis. For ordinal logistic regression and multiple regression, the arrival time was used as the independent variable, while patients' prognosis and emergency medical expenditure were used as dependent variables. Results: The multi-level analysis with both individual and regional variables showed that as the number of emergency medical institutions per 100 km² increased, the time required to reach emergency centers significantly decreased. Ordinal logistic regression and multiple regression results showed that as the arrival time increased, the patients' prognosis significantly worsened and emergency medical expenses significantly increased. Conclusion: Access to emergency care was affected by regional characteristics and in turn affected patient outcomes and emergency medical expenditure.

Development of Daily Hassles Scale for Children in Korea (한국아동의 일상적 스트레스 척도의 개발)

  • 한미현
    • Journal of the Korean Home Economics Association / v.33 no.4 / pp.49-64 / 1995
  • The purpose of this study was to develop the Daily Hassles Scale for children in Korea. The subjects were 444 children (184 fourth graders and 260 sixth graders; 217 male and 227 female) selected from five elementary schools in Seoul. A questionnaire consisting of a 90-item daily hassles scale, demographic questions, and some additional questions was used as the instrument. The statistics used for data analysis were χ², Cramer's V, factor analysis, multi-regression, Pearson's r, and Cronbach's α. The major findings of this study were as follows. 1) 87 items of the 90-item scale were acceptable by the item discriminant method. The discriminant coefficients of the items (Cramer's V) ranged from .28 to .73. 2) 6 factors (parents, home environment, friends, studies, teachers & school, the surroundings) were extracted by factor analysis. Multi-regression analysis conducted to reduce the length of the scale yielded 42 items for 'the Daily Hassles Scale for Children in Korea'. The correlation between this scale and the Quality of Life Scale (Olson & Barnes, 1982) was computed to test criterion-related validity, and the coefficient was significant (r=-.52, p<.001). 3) Finally, the reliability coefficient (Cronbach's α) of this scale was .85.
