• Title/Summary/Keyword: ordinal multinomial data (순서형 다항 자료)


A Local Maximum Likelihood Estimator for Sparse Multinomial Probabilities (희박다항분포확률에 대한 국소최대우도 추정량)

  • Baek, Jang-Seon
    • Proceedings of the Korean Statistical Society Conference / 2002.05a / pp.29-34 / 2002
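  • Let $N=(N_{1},N_{2},\cdots,N_{k})^{T}$ be the vector of cell frequencies observed from a multinomial distribution with probability vector $p=(p_{1},p_{2},\cdots,p_{k})^{T}$, where $\sum_{j=1}^{k}N_{j}=n$. When the total frequency $n$ is very small relative to the number of cells $k$, such discrete data are called sparse multinomial data. When the cells of sparse multinomial data are ordered, the probability $p_{i}$ of the $i$-th cell can be estimated by smoothing the frequency estimators $N_{j}/n$. The local polynomial kernel estimators based on a local least-squares criterion, proposed by Aerts et al. (1997), Baek (1998), and others, have the desirable property of sparse asymptotic consistency, but they have the drawback that the estimated probabilities can take negative values. To overcome this drawback, this study proposes a new kernel estimator based on a local maximum likelihood criterion and investigates its asymptotic properties.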

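
The negativity problem described in the abstract above can be illustrated with a minimal sketch. It is not the estimator from the cited papers: the Gaussian kernel, the cell positions $x_{j}=j/k$, the bandwidth value, and the function name `local_linear_smooth` are all assumptions chosen for illustration. The sketch smooths the frequencies $N_{j}/n$ with a local linear (degree-one) weighted least-squares fit and shows that the fitted value for a cell can fall below zero.

```python
import numpy as np


def local_linear_smooth(counts, bandwidth):
    """Local linear least-squares smoother of cell frequencies N_j / n.

    Illustrative only: Gaussian kernel and cell positions x_j = j / k
    are assumptions, not details taken from the cited papers.
    """
    counts = np.asarray(counts, dtype=float)
    k = counts.size
    freq = counts / counts.sum()          # frequency estimators N_j / n
    x = np.arange(1, k + 1) / k           # ordered cell positions in (0, 1]
    estimates = np.empty(k)
    for i in range(k):
        d = x - x[i]                      # signed distance to the target cell
        w = np.exp(-0.5 * (d / bandwidth) ** 2)   # Gaussian kernel weights
        design = np.column_stack([np.ones(k), d])  # intercept + slope
        sw = np.sqrt(w)
        # Weighted least squares via ordinary least squares on scaled rows.
        beta, *_ = np.linalg.lstsq(design * sw[:, None], freq * sw, rcond=None)
        estimates[i] = beta[0]            # fitted value at the target cell
    return estimates


# Three ordered cells with all observations in the last one.
p_hat = local_linear_smooth([0, 0, 5], bandwidth=1.0)
print(p_hat)  # the first entry is negative
```

With frequencies (0, 0, 1) the data are convex in the cell position, so any positively weighted straight-line fit lies below the first point, which forces a negative estimate for the first cell. A likelihood-based local fit, as proposed in the abstract, is one way to avoid this.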

Building credit scoring models with various types of target variables (목표변수의 형태에 따른 신용평점 모형 구축)

  • Woo, Hyun Seok;Lee, Seok Hyung;Cho, HyungJun
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.85-94 / 2013
  • As the financial market grows, losses caused by failures of credit risk management, whether from poor management of customer information or from poor decision-making, also increase. Credit risk management therefore becomes more important, and it is essential to develop a credit scoring model, a fundamental tool for minimizing credit risk. Credit scoring models have been studied and developed only for binary target variables. In this paper, we consider other types of target variables, such as ordinal multinomial data and longitudinal binary data, and propose credit scoring models for them. We then apply the developed models to real data and random data and evaluate their performance using the Kolmogorov-Smirnov statistic.
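
As a rough illustration of the evaluation measure named in the abstract above, the following sketch computes a two-sample Kolmogorov-Smirnov statistic for the common binary good/bad case: the largest gap between the empirical score distributions of the two groups. The function name `ks_statistic` and the toy scores are hypothetical, and the paper's extension to ordinal or longitudinal targets is not reproduced here.

```python
import numpy as np


def ks_statistic(scores, is_bad):
    """Maximum gap between the empirical CDFs of scores for bad and good accounts.

    Assumes a binary target; this is a generic sketch, not the paper's procedure.
    """
    scores = np.asarray(scores, dtype=float)
    is_bad = np.asarray(is_bad, dtype=bool)
    thresholds = np.unique(scores)             # evaluate at every observed score
    bad = np.sort(scores[is_bad])
    good = np.sort(scores[~is_bad])
    # Empirical CDF: share of each group with score <= threshold.
    cdf_bad = np.searchsorted(bad, thresholds, side="right") / bad.size
    cdf_good = np.searchsorted(good, thresholds, side="right") / good.size
    return float(np.max(np.abs(cdf_bad - cdf_good)))


# Toy data: bad accounts receive strictly lower scores than good accounts.
print(ks_statistic([0.1, 0.2, 0.8, 0.9], [1, 1, 0, 0]))  # 1.0
```

A value near 1 means the score separates the two groups almost perfectly, while a value near 0 means the score distributions are nearly identical.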

Bayesian Analysis of Korean Alcohol Consumption Data Using a Zero-Inflated Ordered Probit Model (영 과잉 순서적 프로빗 모형을 이용한 한국인의 음주자료에 대한 베이지안 분석)

  • Oh, Man-Suk;Oh, Hyun-Tak;Park, Se-Mi
    • The Korean Journal of Applied Statistics / v.25 no.2 / pp.363-376 / 2012
  • Excessive zeros are often observed in ordinal categorical response variables. An ordinary ordered probit model is not appropriate for zero-inflated data, especially when the zero observations arise from many different sources. In this paper, we apply a two-stage zero-inflated ordered probit (ZIOP) model, which incorporates the zero-inflated nature of the data, propose a Bayesian analysis of the ZIOP model, and apply the method to alcohol consumption data collected by the National Bureau of Statistics, Korea. In the first stage of the ZIOP model, a probit model divides the non-drinkers into genuine non-drinkers, who do not drink because of personal beliefs or permanent health problems, and potential drinkers, who did not drink at the time of the survey but have the potential to become drinkers. In the second stage, an ordered probit model is applied to drinkers, who consist of zero-consumption potential drinkers and positive-consumption drinkers. The analysis shows that about 30% of non-drinkers are genuine non-drinkers, and hence the Korean alcohol consumption data exhibit zero inflation. A study of the marginal effect of each explanatory variable shows that certain explanatory variables affect the genuine non-drinkers and the potential drinkers in opposite directions, which an ordered probit model may not detect.
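
The two-stage structure described in the abstract above can be sketched as a calculation of category probabilities. This is a generic textbook-style formulation under stated assumptions, not the authors' Bayesian procedure: the function names, the linear index values, and the cutpoints are hypothetical. The first-stage probit gives the probability of being a (potential) drinker, and the second-stage ordered probit spreads that probability over consumption levels, so a zero response can come from either stage.

```python
import math


def norm_cdf(z):
    """Standard normal CDF using only the standard library."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ziop_probs(participation_index, consumption_index, cutpoints):
    """Category probabilities for a zero-inflated ordered probit.

    participation_index: linear predictor of the first-stage probit (assumed).
    consumption_index:   linear predictor of the second-stage ordered probit (assumed).
    cutpoints:           increasing thresholds of the ordered probit (assumed).
    """
    p_potential = norm_cdf(participation_index)    # P(not a genuine non-drinker)
    # Ordered-probit cumulative probabilities, closed with 1.0 for the top level.
    cdf = [norm_cdf(c - consumption_index) for c in cutpoints] + [1.0]
    ordered = [cdf[0]] + [cdf[j] - cdf[j - 1] for j in range(1, len(cdf))]
    probs = [p_potential * q for q in ordered]
    probs[0] += 1.0 - p_potential                  # zeros from genuine non-drinkers
    return probs


# Three consumption levels (0, 1, 2) with illustrative parameter values.
probs = ziop_probs(participation_index=0.0, consumption_index=0.0, cutpoints=[0.0, 1.0])
print(probs)
```

In this toy setting half of the population are genuine non-drinkers and half of the remaining group report zero consumption, so the zero category receives probability 0.75, more than the ordered probit alone would assign.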