• Title/Abstract/Keywords: Two-stage regression

305 search results (processing time: 0.022 seconds)

Two-stage imputation method to handle missing data for categorical response variable

  • Jong-Min Kim;Kee-Jae Lee;Seung-Joo Lee
    • Communications for Statistical Applications and Methods / Vol. 30, No. 6 / pp. 577-587 / 2023
  • Conventional categorical data imputation techniques, such as mode imputation, often encounter issues related to overestimation. If the variable has too many categories, multinomial logistic regression imputation method may be impossible due to computational limitations. To rectify these limitations, we propose a two-stage imputation method. During the first stage, we utilize the Boruta variable selection method on the complete dataset to identify significant variables for the target categorical variable. Then, in the second stage, we use the important variables for the target categorical variable for logistic regression to impute missing data in binary variables, polytomous regression to impute missing data in categorical variables, and predictive mean matching to impute missing data in quantitative variables. Through analysis of both asymmetric and non-normal simulated and real data, we demonstrate that the two-stage imputation method outperforms imputation methods lacking variable selection, as evidenced by accuracy measures. During the analysis of real survey data, we also demonstrate that our suggested two-stage imputation method surpasses the current imputation approach in terms of accuracy.

Confidence Intervals on Variance Components in Two Stage Regression Model

  • Park, Dong-Joon
    • Communications for Statistical Applications and Methods / Vol. 3, No. 2 / pp. 29-36 / 1996
  • For a regression model with nested error structure, interval estimates of the variability at different stages are proposed. This article derives an approximate confidence interval on the variance in the first stage and an exact confidence interval on the variance in the second stage of a two-stage regression model. The approximate confidence interval is based on the method of Ting et al. (1990). A computer simulation is provided to show that the approximate confidence interval maintains the stated confidence coefficient.


Estimation of Logistic Regression for Two-Stage Case-Control Data

  • 신미영;신은순
    • 응용통계연구 (The Korean Journal of Applied Statistics) / Vol. 13, No. 2 / pp. 237-245 / 2000
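  • In this paper, we fit a logistic regression model to case-control data collected under a two-stage design, estimate the parameters by the WESML method, and derive the asymptotic distribution of the estimator. We also use real data to compare the parameter estimates and standard errors obtained by the WESML and CML methods.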


A Comparison of Alternative Approaches to Determinants of DEA Efficiency Scores

  • 김성호
    • 한국경영과학회지 (Journal of the Korean Operations Research and Management Science Society) / Vol. 35, No. 2 / pp. 19-35 / 2010
  • Many papers have used a two-stage approach of first calculating DEA efficiency scores and then seeking to correlate these scores with various environmental variables. Most of these studies have not checked whether such a two-stage approach is statistically valid for identifying significant environmental variables. Simar and Wilson (2007) (SW) introduced a sensible data generating process and a bootstrap procedure based on truncated regression for the two-stage approach. Banker and Natarajan (2008) (BN) provided a statistical foundation for the two-stage approach comprising a DEA followed by ordinary least squares or maximum likelihood estimation. Researchers have to identify an approach suitable for their research circumstances in terms of properties, merits, demerits, and robustness to plausible departures from its assumed data generating process. We summarize the foundations and properties of the two-stage procedures suggested by SW and BN and discuss their merits and demerits. Using Monte Carlo simulation, we also assess their relative performance under several misspecified settings.

Confidence intervals on variance components in multiple regression model with one-fold nested error structure

  • 박동준
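    • Korean Operations Research and Management Science Society: Conference Proceedings / Proceedings of the 1996 Spring Joint Conference of the Korean Institute of Industrial Engineers and the Korean Operations Research and Management Science Society; Korea Air Force Academy, Cheongju; 26-27 Apr. 1996 / pp. 495-498 / 1996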
  • For a regression model with nested error structure, interval estimates of the variability at different stages are proposed. This article derives an approximate confidence interval on the variance in the first stage and an exact confidence interval on the variance in the second stage of a two-stage regression model. The approximate confidence interval is based on the method of Ting et al. (1990). A computer simulation is provided to show that the approximate confidence interval maintains the stated confidence coefficient.


Two-Stage Penalized Composite Quantile Regression with Grouped Variables

  • Bang, Sungwan;Jhun, Myoungshic
    • Communications for Statistical Applications and Methods / Vol. 20, No. 4 / pp. 259-270 / 2013
  • This paper considers a penalized composite quantile regression (CQR) that performs a variable selection in the linear model with grouped variables. An adaptive sup-norm penalized CQR (ASCQR) is proposed to select variables in a grouped manner; in addition, the consistency and oracle property of the resulting estimator are also derived under some regularity conditions. To improve the efficiency of estimation and variable selection, this paper suggests the two-stage penalized CQR (TSCQR), which uses the ASCQR to select relevant groups in the first stage and the adaptive lasso penalized CQR to select important variables in the second stage. Simulation studies are conducted to illustrate the finite sample performance of the proposed methods.
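
    For orientation, the composite quantile regression objective with a weighted sup-norm group penalty can be sketched as below. This uses the standard CQR formulation, not necessarily the exact estimator of the paper: the notation (K quantile levels τ_k, intercepts b_k, group coefficient vectors β_(g), weights w_g, tuning parameter λ) is assumed here, and the abstract does not state how the adaptive weights are chosen.

    ```latex
    % Check (pinball) loss at quantile level \tau
    \rho_{\tau}(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr)

    % Composite quantile regression over K levels 0 < \tau_1 < \dots < \tau_K < 1,
    % with one intercept b_k per level and a shared slope vector \beta,
    % plus a weighted sup-norm penalty on G groups of coefficients
    \min_{b_1,\dots,b_K,\;\beta}\;
      \sum_{k=1}^{K}\sum_{i=1}^{n}
        \rho_{\tau_k}\!\bigl(y_i - b_k - x_i^{\top}\beta\bigr)
      \;+\; \lambda \sum_{g=1}^{G} w_g \,\bigl\lVert \beta_{(g)} \bigr\rVert_{\infty}
    ```

    Because the sup-norm of a group is zero only when every coefficient in the group is zero, a penalty of this form removes or keeps whole groups, which matches the first-stage role described in the abstract.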

The Two-Stage Least Squares Regression of the Interplay between Education and Local Roads on Foreign Direct Investment in the Philippines

  • DIZON, Ricardo Laurio;CRUZ, Zita Ann Escabarte
    • The Journal of Asian Finance, Economics and Business / Vol. 7, No. 4 / pp. 121-131 / 2020
  • This study investigates the interplay between education and local roads on Foreign Direct Investment (FDI) in the Philippines, using economic growth as an instrument. The study used a quantitative research design applying both descriptive and inferential statistics. A Two-Stage Least Squares regression model was combined with three panel regression approaches (Pooled Least Squares, the Fixed Effect Model, and the Random Effect Model) to study the effects of education and local roads on foreign direct investment in the Philippines. Based on the Fixed Effect regression results, higher education graduates and local road investments, as conditioned by economic growth, were significant factors for foreign direct investment in the Philippines. A unit increase in higher education graduates, as conditioned by economic growth, leads to an 8.758-unit increase in foreign direct investment, while a unit increase in local road investments, as conditioned by economic growth, leads to a 0.002 decrease in foreign direct investment. The regression results suggest that foreign direct investment in regions CAR, I, II, IV-B, V, VIII, IX, X, XI, XII, XIII, and ARMM is higher than in Region IV-A.
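
    The general mechanics of two-stage least squares can be illustrated with a minimal sketch. It is a generic example with one endogenous regressor and one instrument on synthetic data, not the model or data of the study above. The variable names (`z`, `x`, `y`, `u`) and the true coefficient of 2.0 are assumptions made for illustration.

    ```python
    import random


    def ols_simple(x, y):
        """Return (intercept, slope) of the least-squares line y = a + b * x."""
        n = len(x)
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        slope = sxy / sxx
        return mean_y - slope * mean_x, slope


    def two_stage_least_squares(z, x, y):
        """2SLS with a single instrument z for a single endogenous regressor x."""
        # Stage 1: regress the endogenous regressor on the instrument
        # and keep the fitted values.
        a1, b1 = ols_simple(z, x)
        x_hat = [a1 + b1 * zi for zi in z]
        # Stage 2: regress the outcome on the stage-1 fitted values.
        return ols_simple(x_hat, y)


    # Synthetic data (hypothetical): u is an unobserved confounder that affects
    # both x and y, so plain OLS of y on x is biased; z affects y only through x.
    rng = random.Random(0)
    n = 5000
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    y = [1.0 + 2.0 * xi + 3.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

    _, beta_ols = ols_simple(x, y)
    _, beta_2sls = two_stage_least_squares(z, x, y)
    print("OLS slope :", beta_ols)    # biased upward by the confounder
    print("2SLS slope:", beta_2sls)   # should be near the true value 2.0
    ```

    With a single instrument, the second-stage slope equals the ratio cov(z, y) / cov(z, x). Note that standard errors taken naively from the second-stage regression are not valid and need the usual 2SLS correction.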

The Distributions of Variance Components in Two Stage Regression Model

  • Park, Dong-Joon
    • Journal of the Korean Data and Information Science Society / Vol. 7, No. 1 / pp. 87-92 / 1996
  • A regression model with nested error structure is considered. The regression model includes two error terms that are independent and normally distributed with zero means and constant variances. This error structure of the model gives correlated response variables. The distributions of variance components in the regression model with nested error structure are derived by using theorems for quadratic forms.
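
    A simple instance of such a model shows why the two error terms induce correlated responses. The notation is assumed here rather than taken from the paper: a single regressor X_ij, first-stage units i, second-stage units j within i, and variance components σ_P² and σ_E².

    ```latex
    % Nested error regression model with two independent error terms
    Y_{ij} = \beta_0 + \beta_1 X_{ij} + P_i + E_{ij},
      \qquad i = 1,\dots,I,\quad j = 1,\dots,J_i,

    P_i \sim N(0, \sigma_P^2), \qquad
    E_{ij} \sim N(0, \sigma_E^2), \qquad
    \text{all } P_i \text{ and } E_{ij} \text{ mutually independent}

    % Implied second moments of the responses
    \operatorname{Var}(Y_{ij}) = \sigma_P^2 + \sigma_E^2, \qquad
    \operatorname{Cov}(Y_{ij}, Y_{ij'}) = \sigma_P^2 \;\; (j \neq j'), \qquad
    \operatorname{Cov}(Y_{ij}, Y_{i'j'}) = 0 \;\; (i \neq i')
    ```

    Responses that share a first-stage unit i share the term P_i, so they are correlated even though the error terms themselves are independent.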


The Effect and Influencing Mechanism of TPM Factors to Performance

  • Park, Chae-Heung
    • 품질경영학회지 (Journal of the Korean Society for Quality Management) / Vol. 30, No. 4 / pp. 154-163 / 2002
  • This study analyzes how TPM works in the domestic manufacturing industry by estimating a two-stage model. The first stage tests the effects of five TPM-factor variables (TFVs: (1) small group activity and autonomous maintenance, (2) education and training, (3) planned maintenance, (4) improving the effectiveness of each piece of facility, and (5) safety and environment) on two TPM-performance variables (TPVs). The second stage tests how the two TPVs affect the industry's productivity level. By combining these two stages, this study uses a model to explain how TPM, represented by the TFVs, works to improve productivity via the TPVs. Multivariate and univariate regression and correlation analyses were performed. The results show that the five TFVs work in two different ways to improve the industry's productivity level. In the second stage, overall equipment effectiveness has a relatively more significant effect on the productivity level.

Application of covariance adjustment to seemingly unrelated multivariate regressions

  • Wang, Lichun;Pettit, Lawrence
    • Communications for Statistical Applications and Methods / Vol. 25, No. 6 / pp. 577-590 / 2018
  • Employing the covariance adjustment technique, we show that in the system of two seemingly unrelated multivariate regressions the estimator of the regression coefficients can be expressed as a matrix power series, and conclude that the matrix series has a unique simpler form. When the covariance matrix of the system is unknown, we define a two-stage estimator for the regression coefficients, which is shown to be unique and unbiased. Numerical simulations are also presented to illustrate its superiority over the ordinary least squares estimator. As an example, we apply our results to seemingly unrelated growth curve models.