• Title/Summary/Keyword: selection bias model (선택편의 모형)

A study on bias effect of LASSO regression for model selection criteria (모형 선택 기준들에 대한 LASSO 회귀 모형 편의의 영향 연구)

  • Yu, Donghyeon
    • The Korean Journal of Applied Statistics / v.29 no.4 / pp.643-656 / 2016
  • High-dimensional data, in which the number of variables exceeds the number of samples, are frequently encountered in various fields. With such data it is usually necessary to select variables in order to estimate regression coefficients and avoid overfitting. Penalized regression models perform variable selection and coefficient estimation simultaneously, which is why they are frequently used for high-dimensional data. However, a penalized regression model still requires selecting the optimal model by choosing a tuning parameter according to a model selection criterion. This study deals with the effect of the bias of LASSO regression on model selection criteria. We numerically describe the effect of the bias on the model selection criteria and apply the proposed correction to the identification of biomarkers for lung cancer based on gene expression data.
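
The shrinkage bias mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an orthonormal design (X'X = I) and the objective 0.5 * RSS + lam * sum(|b|), under which the LASSO estimate is the soft-thresholded least-squares estimate. The function names are illustrative, and this is not the correction proposed in the paper.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam, clipping at zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(ols_coefs, lam):
    """LASSO coefficients under an assumed orthonormal design.

    ols_coefs are the ordinary least-squares estimates; each one is
    soft-thresholded independently at level lam.
    """
    return [soft_threshold(z, lam) for z in ols_coefs]


if __name__ == "__main__":
    ols = [3.0, -2.0, 0.4]
    fitted = lasso_orthonormal(ols, lam=1.0)
    # Every surviving coefficient is pulled toward zero by exactly lam,
    # which is the bias that distorts residual-based selection criteria.
    print(fitted)
```

Because the nonzero estimates are all shrunk by lam, the residual sum of squares computed from them is larger than it would be after an unpenalized refit on the selected variables, which is one way the bias can feed into a selection criterion.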

Financial performance analysis of guaranteed firms using propensity scores (성향점수를 활용한 보증기업의 재무성과 분석)

  • Nam, Joo-Ha;Kim, Jung-Ryol;Noh, Maengseok
    • The Korean Journal of Applied Statistics / v.29 no.2 / pp.389-398 / 2016
  • In this paper, we examine the financial performance of credit guarantee programs. We compare the financial performance of firms guaranteed by KODIT with that of non-guaranteed firms. A covariate-adjusted propensity score method is used because a selection bias problem could occur if a t-test or regression analysis were used. The results show that the credit guarantee program enhances the financial performance of beneficiary firms.

Inherent Random Heterogeneity Logit Model for Stated Preference Freight Mode Choice (SP 화물수단선택을 위한 Inherent Random Heterogeneity 로짓 모형 연구)

  • KIM, Kang-Soo
    • Journal of Korean Society of Transportation / v.20 no.3 / pp.83-92 / 2002
  • Freight mode choice models are essential to the analysis of many areas of transport research. However, observations of actual market choices have only been made in a limited number of situations. Stated preference (SP) techniques have therefore emerged as an alternative data source to actual market choices for estimating freight mode choice models. There is considerable confidence in SP data, but little consideration has been given to the potential for estimation bias. This paper is motivated by the theoretical side of estimating SP discrete choice models and focuses on a case study of freight mode choice. Recently developed simulation methods are used to construct inherent random heterogeneity logit models, which account for individual heterogeneity and its inheritance across subsequent choices and which overcome the independence from irrelevant alternatives (IIA) property. This paper contributes to the development of models dealing with heterogeneity and its inheritance, and sheds light on the heterogeneity of freight transport.
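
The IIA property referred to in the abstract above can be seen in a minimal sketch of the standard multinomial logit, which is the baseline that heterogeneity models relax. The mode names and utility values are hypothetical and are not taken from the paper.

```python
import math


def mnl_probs(utilities):
    """Multinomial logit choice probabilities from a dict of systematic utilities."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    exps = {mode: math.exp(v - top) for mode, v in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}


if __name__ == "__main__":
    two_modes = mnl_probs({"road": 1.0, "rail": 0.5})
    three_modes = mnl_probs({"road": 1.0, "rail": 0.5, "sea": 0.8})
    # Under IIA the road/rail odds do not change when "sea" is added.
    print(two_modes["road"] / two_modes["rail"])
    print(three_modes["road"] / three_modes["rail"])
```

In both choice sets the road-to-rail odds equal exp(1.0 - 0.5), whatever the third alternative looks like. A model with random individual heterogeneity no longer forces this constant ratio.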

The wage determinants applying sample selection bias (표본선택 편의를 반영한 임금결정요인 분석)

  • Park, Sungik;Cho, Jangsik
    • Journal of the Korean Data and Information Science Society / v.27 no.5 / pp.1317-1325 / 2016
  • The purpose of this paper is to explain the factors affecting the wages of vocational high school graduates. In particular, we examine the effectiveness of controlling for sample selection bias by employing the Tobit model and the Heckman sample selection model. The major results are as follows. First, the Tobit model and the Heckman sample selection model, both of which control for sample selection bias, are shown to be statistically significant, and all the independent variables appear to be statistically consistent with the theoretical model. Second, gender is statistically significant for both the probability of employment and the wage. Third, the employment probability and wage of Meister high school graduates are high compared to all other graduates. Fourth, the higher the parents' income, the higher both the employment probability and the wage. Finally, parents' education level, high school grades, satisfaction, and the number of licenses are statistically significant for both the probability of employment and the wage.

Korean women wage analysis using selection models (표본 선택 모형을 이용한 국내 여성 임금 데이터 분석)

  • Jeong, Mi Ryang;Kim, Mijeong
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.1077-1085 / 2017
  • In this study, we identify the major factors that affect Korean women's wages by analysing data from the 2015 Korean Labor and Income Panel Study (KLIPS). In general, wage data are difficult to analyse because random sampling is infeasible. The Heckman sample selection model is the most widely used method for analysing data subject to sample selection. Heckman proposed two kinds of selection models: one estimated by maximum likelihood and the other the Heckman two-stage model. The Heckman two-stage model is known to be robust to violations of the bivariate normality assumption on the error terms. Recently, Marchenko and Genton (2012) proposed the Heckman selection-t model, which generalizes the Heckman two-stage model, and concluded that the Heckman selection-t model is more robust to the error assumptions. We analyse the data with the two models and compare the results.
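
The correction term at the centre of the Heckman two-stage model described above can be sketched as follows, assuming the textbook setup with bivariate normal errors. The first stage is a probit for selection, and the second stage regresses the outcome on the covariates plus the inverse Mills ratio evaluated at the fitted probit index. The function and argument names are illustrative, not taken from the paper.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via erfc for better lower-tail accuracy."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def inverse_mills_ratio(index):
    """phi(index) / Phi(index), evaluated at a fitted probit index.

    Intended for moderate index values; the ratio cannot be computed this
    way once Phi(index) underflows to zero in the far lower tail.
    """
    return normal_pdf(index) / normal_cdf(index)


def expected_outcome_if_selected(xb, rho, sigma, index):
    """E[y | selected] = x'beta + rho * sigma * IMR(index) under bivariate normality."""
    return xb + rho * sigma * inverse_mills_ratio(index)


if __name__ == "__main__":
    # With rho = 0 there is no selection effect and the mean is just x'beta.
    print(expected_outcome_if_selected(2.0, 0.0, 1.0, 0.3))
    print(expected_outcome_if_selected(2.0, 0.5, 1.0, 0.3))
```

The size of the correction depends on rho, the correlation between the selection and outcome errors. When rho is zero, ordinary least squares on the selected sample is not biased by selection.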

Corporate Debt Choice: Application of Panel Sample Selection Model (기업의 부채조달원 선택에 관한 연구: 패널표본선택모형의 적용)

  • Lee, Ho Sun
    • The Journal of the Korea Contents Association / v.15 no.7 / pp.428-435 / 2015
  • Examining corporate financing statistics in Korea, I identify several trends. First, large enterprises use both bank loans and direct financing, such as corporate bonds, as debt. Second, small and medium-sized companies mainly use bank loans only. I therefore argue that there is sample selection bias in corporate debt choice and that a sample selection methodology is more appropriate for analysing corporate debt choice behavior. I test a panel sample selection model using data on listed Korean firms from 1990 to 2013 and find that the panel sample selection model is appropriate.

Joint penalization of components and predictors in mixture of regressions (혼합회귀모형에서 콤포넌트 및 설명변수에 대한 벌점함수의 적용)

  • Park, Chongsun;Mo, Eun Bi
    • The Korean Journal of Applied Statistics / v.32 no.2 / pp.199-211 / 2019
  • This paper is concerned with finite mixture of regression modeling and with the simultaneous selection of the number of mixing components and the relevant predictors. We propose a penalized likelihood method for both mixture components and regression coefficients that enables the simultaneous identification of significant variables and the determination of important mixture components in mixture of regression models. To avoid over-fitting and bias problems, we apply smoothly clipped absolute deviation (SCAD) penalties on the logarithm of the component probabilities, as suggested by Huang et al. (Statistica Sinica, 27, 147-169, 2013), together with several well-known penalty functions for the coefficients in the regression models. Simulation studies show that our method performs satisfactorily with well-known penalties such as SCAD, MCP, and adaptive lasso.
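
For reference, the SCAD penalty named in the abstract above can be written as a short function. This is a sketch of the generic SCAD penalty on a single coefficient in its usual piecewise form, with the conventional default a = 3.7. It is not the paper's penalized likelihood, which applies the penalty to log component probabilities.

```python
def scad_penalty(t, lam, a=3.7):
    """SCAD penalty value for a single coefficient t.

    Linear (LASSO-like) for |t| <= lam, quadratic between lam and a * lam,
    and constant beyond a * lam, so large coefficients get no extra shrinkage.
    """
    if lam < 0 or a <= 2:
        raise ValueError("SCAD requires lam >= 0 and a > 2")
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


if __name__ == "__main__":
    for value in (0.5, 2.0, 10.0):
        print(value, scad_penalty(value, lam=1.0))
```

The flat region beyond a * lam is what distinguishes SCAD from the LASSO penalty, whose cost keeps growing with the coefficient and therefore biases large estimates.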

Examining the anchoring effect in double-bounded dichotomous choice CV data (이중 양분선택형 질문 CV자료에서의 정박효과 검토)

  • Sin, Yeong-Cheol
    • Environmental and Resource Economics Review / v.8 no.1 / pp.51-73 / 1998
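  • The double-bounded dichotomous choice question format, a willingness-to-pay elicitation method used in the contingent valuation method (CVM), was proposed to overcome the statistical inefficiency of single-bounded dichotomous choice CV data. Despite its several advantages, the format is suspected of being vulnerable to the anchoring effect, the psychological basis of starting point bias. This paper therefore presents a general willingness-to-pay model with which the anchoring effect in double-bounded dichotomous choice CV data can be examined, and proposes a method for examining the anchoring effect based on that model. The model takes the bivariate model proposed by Cameron and Quiggin (1994) and adds the dichotomous response to the first bid as an explanatory variable for the second latent willingness to pay. If the coefficient on the response to the first bid is negative and statistically significant, the anchoring effect can be regarded as present. If this coefficient test does not confirm an anchoring effect, and the means of the two willingness-to-pay estimates from the two responses cannot be said to differ, there is no need to be concerned about the anchoring effect. Applying the model and method to CV data on water quality improvement in the Han River confirms that there is no need to be concerned about the anchoring effect.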

An Analysis of the Factors of Youth Unemployment and Nonparticipation in Korea (청년층 미취업의 실태 및 원인 분석)

  • Kim, Ahnkook
    • Journal of Labour Economics / v.26 no.1 / pp.23-52 / 2003
  • This study focuses on unemployment and labor force nonparticipation among youth. Dividing youth nonparticipants into 'housework and child care', 'studying and training', and 'others' categories, we estimate potential wages with a selectivity bias model and analyse the factors behind choosing unemployment or nonparticipation with a multinomial logit model. The gaps between the potential market wage and the desired wage are greater for the 'studying and training' and 'others' groups of nonparticipants than for the unemployed group. The gap between potential and desired wages is larger for men, younger people, and those with less schooling than for women, older people, and those with more schooling. In the choice between unemployment and nonparticipation, men, older people, householders, and holders of qualifications are less likely to opt for nonparticipation. Job experience lowers the probability of choosing employment but raises the probability of choosing unemployment or nonparticipation. These results suggest that the quality of youth employment is very poor.

An Alternative Parametric Estimation of Sample Selection Model: An Application to Car Ownership and Car Expense (비정규분포를 이용한 표본선택 모형 추정: 자동차 보유와 유지비용에 관한 실증분석)

  • Choi, Phil-Sun;Min, In-Sik
    • Communications for Statistical Applications and Methods / v.19 no.3 / pp.345-358 / 2012
  • In a parametric sample selection model, the distributional assumption is critical for obtaining consistent estimates. Conventionally, normality has been assumed for the error terms of both the selection and main equations of the model. The normality assumption, however, may excessively restrict the true underlying distribution of the model. This study introduces the $S_U$-normal distribution as the error distribution of a sample selection model. The $S_U$-normal distribution can accommodate a wide range of skewness and kurtosis compared to the normal distribution. It also includes the normal distribution as a limiting distribution. Moreover, the $S_U$-normal distribution can be easily extended to multivariate dimensions. We provide the log-likelihood function and expected value formula based on a bivariate $S_U$-normal distribution in a sample selection model. The simulation results indicate that the $S_U$-normal model outperforms the normal model in terms of the consistency of the estimators. As an empirical application, we estimate a sample selection model for the relationship between car ownership and car expenses.
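
The flexibility described in the abstract above can be sketched under the assumption that the $S_U$-normal distribution follows Johnson's $S_U$ construction, in which a standard normal draw is passed through a shifted and scaled sinh transform. The paper's exact parameterization may differ, and the parameter names below are illustrative.

```python
import math
import random


def su_from_normal(z, gamma, delta, xi=0.0, lam=1.0):
    """Map a standard normal value z to a Johnson S_U value.

    gamma shifts (and so skews), delta > 0 controls tail weight,
    xi and lam are location and scale.
    """
    return xi + lam * math.sinh((z - gamma) / delta)


def sample_su(n, gamma, delta, xi=0.0, lam=1.0, seed=0):
    """Draw n S_U values by transforming seeded standard normal draws."""
    rng = random.Random(seed)
    return [su_from_normal(rng.gauss(0.0, 1.0), gamma, delta, xi, lam)
            for _ in range(n)]


if __name__ == "__main__":
    heavy_tailed = sample_su(5, gamma=0.5, delta=1.0)
    print(heavy_tailed)
    # With lam = delta and delta large, the transform is close to the
    # identity, which is the normal limiting case.
    print(su_from_normal(1.0, gamma=0.0, delta=1e4, lam=1e4))
```

Small values of delta stretch the tails and a nonzero gamma makes the distribution asymmetric, which is how a single family can cover a range of skewness and kurtosis.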