• Title/Summary/Keyword: nonparametric model (비모수 모형)


Variable Selection in Normal Mixture Model Based Clustering under Heteroscedasticity (이분산 상황 하에서 정규혼합모형 기반 군집분석의 변수선택)

  • Kim, Seung-Gu
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1213-1224 / 2011
  • In high-dimensional settings where the number of variables far exceeds the number of observations, noninformative variables must be removed in order to cluster the observations. Most model-based approaches to variable selection assume homoscedasticity, and their models are mainly estimated by a penalized likelihood method. This paper proposes a different approach that removes the noninformative variables effectively and simultaneously clusters the observations using a modified normal mixture model. The validity of the model is established and an EM algorithm is derived to estimate the parameters. Simulation studies and an experiment using a real microarray dataset showed the effectiveness of the proposed method.
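To make the EM step mentioned in the abstract concrete, the following is a minimal sketch of EM for a two-component univariate normal mixture with unequal variances (the heteroscedastic case). It is not the modified mixture model or the variable-selection procedure of the paper; the function names, the initialisation, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_normals(data, n_iter=200, min_var=1e-6):
    """Fit a two-component normal mixture with component-specific variances."""
    data = sorted(data)
    n = len(data)
    half = n // 2
    # Crude initialisation (an assumption): split the sorted sample in half.
    means = [sum(data[:half]) / half, sum(data[half:]) / (n - half)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior membership probability of each point.
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: update weight, mean, and variance of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, min_var)  # keep the variance away from zero
    return weights, means, variances


# Hypothetical data: two clusters with different spreads.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(300)]
sample += [random.gauss(6.0, 2.0) for _ in range(300)]
weights, means, variances = em_two_normals(sample)
print(weights, means, variances)
```

Letting each component keep its own variance is what distinguishes this from the homoscedastic setting that the abstract says most earlier approaches assume.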

Stochastic projection on international migration using Coherent functional data model (일관성 함수적 자료모형을 활용한 국제인구이동의 확률적 예측)

  • Kim, Soon-Young;Oh, Jinho
    • The Korean Journal of Applied Statistics / v.32 no.4 / pp.517-541 / 2019
  • According to the OECD (2015) and UN (2017), Korea is classified as an immigration country. This designation means that net migration will remain positive and that international migration is likely to affect population growth. KOSTAT (2011) used a model with more than 15 parameters, based on the Wilson (2010) model, which takes population migration factors into account, to separate migration by sex and by immigration and emigration. Five years later, the assumption was the average domestic net migration rate over the preceding five years together with the quota likely under government policy on foreigners. However, both sets of results were conservative estimates of international migration and differ from the results used by the OECD and UN to classify an immigration country. In this paper, we propose a stochastic projection of international migration using nonparametric functional data models (the FDM of Hyndman and Ullah (2007) and the Coherent FDM of Hyndman et al. (2013)) applied to Korea's international migration data for 2000-2017, noting that immigration, emigration, and net migration are non-linear. According to the results, the immigration rate per 1,000 population will be 1.098 (male) and 1.026 (female) in 2018 and 1.228 (male) and 1.152 (female) in 2025, and the emigration rate will be 0.907 (male) and 0.879 (female) in 2018 and 0.987 (male) and 0.959 (female) in 2025. Net migration is thus expected to increase to 0.191 (male) and 0.148 (female) in 2018 and to 0.241 (male) and 0.192 (female) in 2025 per 1,000 population.
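The reported net migration rates can be checked against the immigration and emigration rates in the abstract, since net migration is immigration minus emigration. The short sketch below does this arithmetic; the dictionary layout and variable names are assumptions for illustration, while the numbers are copied from the abstract.

```python
# Rates per 1,000 population, copied from the abstract.
immigration = {
    (2018, "male"): 1.098, (2018, "female"): 1.026,
    (2025, "male"): 1.228, (2025, "female"): 1.152,
}
emigration = {
    (2018, "male"): 0.907, (2018, "female"): 0.879,
    (2025, "male"): 0.987, (2025, "female"): 0.959,
}
reported_net = {
    (2018, "male"): 0.191, (2018, "female"): 0.148,
    (2025, "male"): 0.241, (2025, "female"): 0.192,
}

# Net migration rate = immigration rate - emigration rate.
net = {key: round(immigration[key] - emigration[key], 3) for key in immigration}

# Largest absolute gap between the recomputed and the reported net rates.
max_gap = max(abs(net[key] - reported_net[key]) for key in net)

for key in sorted(net):
    print(key, "recomputed:", net[key], "reported:", reported_net[key])
print("largest gap:", max_gap)
```

The recomputed values agree with the reported ones to within about 0.001; the small gaps in the female rates are plausibly due to the published figures being derived from unrounded rates, though the abstract does not say so.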

Prediction of Divided Traffic Demands Based on Knowledge Discovery at Expressway Toll Plaza (지식발견 기반의 고속도로 영업소 분할 교통수요 예측)

  • Ahn, Byeong-Tak;Yoon, Byoung-Jo
    • KSCE Journal of Civil and Environmental Engineering Research / v.36 no.3 / pp.521-528 / 2016
  • The tollbooths of a main motorway toll plaza are usually operated proactively in response to variations in the traffic demand of two vehicle types, namely cars and other (heavy) vehicles. Accurate forecasts of traffic volumes for the two vehicle types are therefore a key element of advanced tollgate operation. Unfortunately, the univariate short-term prediction techniques in the literature cannot easily generate the traffic demands of the two vehicle types simultaneously. This practical and academic background makes forecasting the future traffic volumes of the two vehicle types at an acceptable level of accuracy an attractive research topic in Intelligent Transportation System (ITS) forecasting. To address the shortcomings of univariate short-term prediction techniques, this article introduces a Multiple In-and-Out (MIO) forecasting model that generates the two traffic volumes simultaneously. The MIO model, based on a non-parametric approach, is devised for conditions in which large-scale historical data are accessible on-line. In a feasibility test with actual data, the proposed model outperformed Kalman filtering, a widely used univariate model, in prediction accuracy despite its multivariate prediction scheme.

A Generalized Marginal Logit Model for Repeated Polytomous Response Data (반복측정의 다가 반응자료에 대한 일반화된 주변 로짓모형)

  • Choi, Jae-Sung
    • The Korean Journal of Applied Statistics / v.21 no.4 / pp.621-630 / 2008
  • This paper discusses how to construct a generalized marginal logit model for analyzing repeated polytomous response data when some factors are applied to larger experimental units as treatments and time is applied to a smaller experimental unit as a repeated-measures factor. Two different sizes of experimental unit are therefore considered. Weighted least squares (WLS) methods are used to estimate the fixed effects in the suggested model.

Hierachical Bayes Estimation of Small Area Means in Repeated Survey (반복조사에서 소지역자료 베이지안 분석)

  • 김달호;김남희
    • The Korean Journal of Applied Statistics / v.15 no.1 / pp.119-128 / 2002
  • In this paper, we consider hierarchical Bayes (HB) estimators of small area means in repeated surveys. Rao and Yu (1994) considered a small area model with repeated survey data and proposed empirical best linear unbiased estimators. We propose a hierarchical Bayes version of the Rao and Yu model by assigning prior distributions to the unknown hyperparameters. We illustrate our HB estimator using a very popular data set in the small area problem and then compare the results with the Census Bureau estimator and other previously proposed estimators.

Estimation of the Survival Function under Extreme Right Censoring Model (극단적인 오른쪽 관측중단모형에서 생존함수의 추정)

  • Lee, Jae-Man
    • Journal of the Korean Data and Information Science Society / v.11 no.2 / pp.225-233 / 2000
  • We consider life-testing experiments in which the longest time an experimental unit is on test is not a failure time but a censored observation. In this situation the Kaplan-Meier estimator is known to be a biased estimator of the survival function. Several modifications of the Kaplan-Meier estimator are examined and compared in terms of bias and mean squared error.
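The situation described in the abstract can be illustrated with a minimal sketch of the standard Kaplan-Meier product-limit estimator. It is not one of the modified estimators studied in the paper; the function name and the small data set are hypothetical. In the example the largest observation is censored, so the estimated survival curve never drops to zero.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct failure time.

    events[i] is True for an observed failure and False for a censored time.
    """
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        failures = 0
        removed = 0
        # Group all observations that share the same time.
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                failures += 1
            removed += 1
            i += 1
        if failures:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical life-test data; the two longest times (9 and 10) are censored.
times = [2, 3, 3, 5, 7, 9, 10]
events = [True, True, False, True, True, False, False]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 4))
```

Because no failure is observed after time 7, the estimate stays at its last value beyond that point, which is why the treatment of the tail matters when the longest observation is censored.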


The Comparative Study of Software Optimal Release Time Based on Log property Distribution (로그형 특성분포에 근거한 소프트웨어 최적 방출시기에 관한 비교 연구)

  • Kim, Hee-Cheul;Park, Hyoung-Keun
    • Proceedings of the KAIS Fall Conference / 2010.05a / pp.149-152 / 2010
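  • This study addresses the release problem of deciding when a software product, after development and testing, should be delivered to users. The release-time model applies a nonhomogeneous Poisson process that depends on an infinite number of failures. Such a process reflects the possibility that new faults arise while existing faults are being removed or corrected. The applied models use log-type property distributions, such as the Gompertz, Pareto, and log-logistic models, which are efficient for fitting a variety of lifetime distributions. The optimal software release policy is thus the release time that satisfies the required software reliability while minimizing the total cost of software development and maintenance. In the numerical example, failure interval time data were used, and the optimal release time was estimated with parameters obtained by maximum likelihood estimation.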


Mixed effects least squares support vector machine for survival data analysis (생존자료분석을 위한 혼합효과 최소제곱 서포트벡터기계)

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / v.23 no.4 / pp.739-748 / 2012
  • In this paper we propose a mixed effects least squares support vector machine (LS-SVM) for censored data observed from different groups. We use weights that account for random right censoring in the nonlinear regression. The weights are formed from Kaplan-Meier estimates of the censoring distribution. The proposed model includes a random effects term representing inter-group variation. Furthermore, a generalized cross validation function is proposed for selecting the optimal values of the hyper-parameters. Experimental results are then presented that indicate the performance of the proposed LS-SVM by comparison with a standard LS-SVM for censored data.

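A Study on Measuring the Speed of Technological Innovation Convergence among Countries: Focusing on Korea, the United States, Japan, and China (국가간에 기술혁신수검속도측정에 관한 연구: 한국, 미국, 일본, 중국을 중심으로)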

  • 권성혁;조상섭
    • Proceedings of the Korea Technology Innovation Society Conference / 2005.10a / pp.184-198 / 2005
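  • This study is an empirical investigation, centered on Korea, of whether technological innovation converges among major countries and how long that convergence takes. The results are summarized as follows. First, under the assumption of linear diffusion of technological innovation, no diffusion of technological innovation was found among the countries analyzed. Second, when the existence and diffusion period of technological innovation were measured more directly, the coefficients estimated from an AR(1) model assuming linearity took a stationary form except in the case of Japan and Korea. Third, parametric estimation under linearity gave longer convergence periods than nonparametric estimation under nonlinearity. In particular, for Japan and Korea, no convergence appeared under the linear assumption, whereas convergence appeared under the nonlinear one. Finally, the convergence period between the United States and Korea was longer than that between Korea and China. These findings suggest that when the diffusion and convergence periods of technological innovation are long, the existence of innovation convergence may be difficult to verify with linear tests. The policy implications are twofold. First, Korea's level of technological innovation can be regarded as low enough for China to imitate easily. Second, as a policy response to China's imitation of Korean technological innovation, Korea's innovation cycles need to be short and frequent.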


Estimation of genetic parameter for carcass traits in commercial Hanwoo steer (일반농가 한우의 도체형질에 관한 유전모수 추정)

  • Lee, Yoonseok;Lee, Jea Young
    • Journal of the Korean Data and Information Science Society / v.27 no.3 / pp.741-747 / 2016
  • The aim of this study was to estimate genetic parameters of carcass traits in commercial Hanwoo steers using the national animal model for selection of superior bulls. The analyzed data (n=5,843) on carcass traits were collected from 107,020 Hanwoo steers. The animal model was used to estimate heritability and genetic correlations. The estimated heritabilities were 0.19, 0.17, 0.20 and 0.23 for carcass weight, eye muscle area, backfat thickness and marbling score, respectively. The estimated heritabilities for carcass traits in commercial Hanwoo are lower than those estimated for the national progeny test population used for selection of superior bulls, because the breeding environment, the genetic performance of cows and the number of feeding days differed. Therefore, we suggest that the animal model can include practical genetic variables based on the national animal model to improve genetic performance in commercial Hanwoo.
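For readers unfamiliar with the quantity being estimated, the following is the usual textbook form of a single-trait animal model and the heritability derived from it. The abstract does not state the model equations, so the notation ($\mathbf{y}$, $\mathbf{X}$, $\mathbf{Z}$, $\mathbf{A}$, $\sigma_a^2$, $\sigma_e^2$) is an assumption and may differ from the specification used in the paper.

```latex
% Assumed standard single-trait animal model (not taken from the abstract):
%   y: observed carcass trait, b: fixed effects, a: additive genetic (animal) effects,
%   e: residuals, A: additive relationship matrix.
\[
  \mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{Z}\mathbf{a} + \mathbf{e},
  \qquad
  \mathbf{a} \sim N(\mathbf{0}, \mathbf{A}\sigma_a^2),
  \quad
  \mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)
\]
% Narrow-sense heritability: the share of phenotypic variance due to additive genetic variance.
\[
  h^2 = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}
\]
```

Under this definition, the reported value of 0.19 for carcass weight would mean that about 19% of the phenotypic variance in that trait is attributed to additive genetic effects.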