• Title/Summary/Keyword: Bayesian model

Bayesian analysis of Korean income data using zero-inflated Tobit model (영과잉 토빗모형을 이용한 한국 소득분포 자료의 베이지안 분석)

  • Hwang, Jisu;Kim, Sei-Wan;Oh, Man-Suk
    • The Korean Journal of Applied Statistics / v.30 no.6 / pp.917-929 / 2017
  • Korean income data from the Korea Labor Panel Survey contain excessive zeros, which the Tobit model may not explain properly. In this paper, we analyze the data using a zero-inflated Tobit model to accommodate the excess zeros. A zero-inflated Tobit model consists of two stages. In the first stage, individuals with zero income are divided into two groups: a genuine zero group and a random zero group. Individuals in the genuine zero group did not participate in the labor market because they had no intention of doing so. Individuals in the random zero group participated in the labor market, but their incomes are very low and truncated at 0. In the second stage, the Tobit model is applied to the subset of data combining random zeros and positive observations. Regression models are employed in both stages to estimate the effects of explanatory variables on labor market participation and on the amount of income. Markov chain Monte Carlo methods are applied for the Bayesian analysis of the data. The proposed zero-inflated Tobit model outperforms the Tobit model in model fit and in predicting the frequency of zeros. The results show strong evidence that the probability of participating in the labor market increases with age and decreases with education, and that women tend to have stronger intentions to participate in the labor market than men. There is also moderate evidence that the probability of participating in the labor market decreases with socio-economic status and reservation wage. However, the monthly wage increases with age and education, and it is higher for married than for unmarried individuals and for men than for women.
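
The two-stage structure described in this abstract can be illustrated with a per-observation log-likelihood. This is only a minimal sketch, not the authors' implementation: it assumes a scalar genuine-zero probability `p_genuine` and a scalar latent mean `mu` (in the paper both depend on covariates through regression models), and it treats low incomes as censored at 0, as in the standard Tobit model. The function and parameter names are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_logpdf(z):
    """Log-density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def zi_tobit_loglik(y, mu, sigma, p_genuine):
    """Log-likelihood of one income y under a zero-inflated Tobit sketch.

    Assumptions (not taken from the paper): 0 <= p_genuine < 1 is the
    probability of a genuine zero; otherwise a latent income
    y* ~ N(mu, sigma^2) is observed as max(0, y*).
    """
    if y > 0:
        # Positive income: the individual participated and y* = y.
        return (math.log(1.0 - p_genuine)
                + norm_logpdf((y - mu) / sigma)
                - math.log(sigma))
    # Zero income: either a genuine zero or a random zero (y* <= 0).
    return math.log(p_genuine + (1.0 - p_genuine) * norm_cdf(-mu / sigma))
```

Setting `p_genuine = 0` recovers the ordinary Tobit likelihood, which makes the extra mass at zero the only difference between the two models being compared.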

Review of Mixed-Effect Models (혼합효과모형의 리뷰)

  • Lee, Youngjo
    • The Korean Journal of Applied Statistics / v.28 no.2 / pp.123-136 / 2015
  • Science has made great advances since Galileo's discovery of laws describing relationships between observable variables. However, many natural phenomena are better explained by models that include unobservable random effects. The mixed effect model was the first statistical model to include unobservable random effects. The importance of mixed effect models is growing along with advances in the computational technologies used to infer complicated phenomena; mixed effect models have subsequently been extended to various statistical models such as hierarchical generalized linear models. Hierarchical likelihood has been suggested for estimating unobservable random effects. Our special issue on mixed effect models shows how they can be used in statistical problems and discusses important needs for future development. Frequentist and Bayesian approaches are also investigated.

Spatial Analysis for Mean Annual Precipitation Based On Neural Networks (신경망 기법을 이용한 연평균 강우량의 공간 해석)

  • Sin, Hyeon-Seok;Park, Mu-Jong
    • Journal of Korea Water Resources Association / v.32 no.1 / pp.3-13 / 1999
  • In this study, the Spatial-Analysis Neural-Network (SANN) is presented as an alternative to conventional spatial analysis methods such as the Thiessen method, the Inverse Distance method, and Kriging. It is based on neural network modeling and provides a nonparametric mean estimator as well as estimators of higher-order statistics such as standard deviation and skewness. In addition, it provides a decision-making tool that includes an estimator of the posterior probability that a spatial variable at a given point belongs to each of several classes representing the severity of the problem of interest, and a Bayesian classifier that defines the boundaries of the subregions belonging to those classes. In this paper, the SANN is used to analyze a mean annual precipitation field and to classify the field into dry, normal, and wet subregions. As an example, it is applied to the whole area of South Korea using 39 precipitation sites. Several useful results on the spatial variability of mean annual precipitation in South Korea were obtained, such as an interpolated field, a standard deviation field, and probability maps. In addition, the whole of South Korea was classified into dry, normal, and wet regions.

Fuzzy Clustering Model using Principal Components Analysis and Naive Bayesian Classifier (주성분 분석과 나이브 베이지안 분류기를 이용한 퍼지 군집화 모형)

  • Jun, Sung-Hae
    • The KIPS Transactions:PartB / v.11B no.4 / pp.485-490 / 2004
  • In data representation, clustering is a grouping process that combines given data into clusters of similar objects. Various similarity measures have been used in many studies. However, the validity of clustering results is subjective and ambiguous because objective criteria for clustering are scarce and difficult to define. Fuzzy clustering provides a good method for subjective clustering problems. It performs clustering through a similarity matrix that holds a fuzzy membership value for assigning each object. In this paper, for objective fuzzy clustering, we propose a clustering algorithm that combines principal components analysis, as a dimension reduction model, with Bayesian learning, as a statistical learning theory. To evaluate the performance of the proposed algorithm, the Iris and Glass identification data from the UCI Machine Learning repository are used. The experimental results show favorable performance for the proposed model.

A study on MERS-CoV outbreak in Korea using Bayesian negative binomial branching processes (베이지안 음이항 분기과정을 이용한 한국 메르스 발생 연구)

  • Park, Yuha;Choi, Ilsu
    • Journal of the Korean Data and Information Science Society / v.28 no.1 / pp.153-161 / 2017
  • Branching processes, which are used as stochastic models of epidemic spread, have the advantage that their parameters can be estimated from real data. To use the negative binomial distribution as the offspring distribution of a branching process, both the mean and the dispersion parameter must be estimated. In existing studies in biology and epidemiology, estimation relies on maximum-likelihood methods. However, for most epidemic data it is hard to obtain a precise maximum-likelihood estimate. We propose Bayesian inference, which has good statistical properties for small samples. After estimating the dispersion parameter, we modelled the posterior distribution for the 2015 Korea MERS cases. We found that the estimated dispersion parameter is relatively stable regardless of the assumed prior distribution. We also computed extinction probabilities of the branching process using the estimated dispersion parameters.
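
The extinction-probability step mentioned at the end of this abstract can be sketched as follows. This is an illustrative sketch rather than the authors' code: it assumes the common mean/dispersion parameterization of the negative binomial offspring distribution, whose probability generating function is g(s) = (1 + (R/k)(1 - s))^(-k), and it finds the extinction probability as the smallest fixed point of g on [0, 1] by iterating from 0. The names `mean` (R) and `dispersion` (k) are hypothetical.

```python
def nb_pgf(s, mean, dispersion):
    """PGF of a negative binomial offspring distribution with the given
    mean and dispersion parameter (mean/dispersion parameterization)."""
    return (1.0 + (mean / dispersion) * (1.0 - s)) ** (-dispersion)


def extinction_probability(mean, dispersion, tol=1e-12, max_iter=100000):
    """Smallest fixed point of the PGF on [0, 1], found by iterating
    q <- g(q) starting from q = 0.

    The iterates increase monotonically toward the extinction probability,
    which equals 1 when the offspring mean is at most 1.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = nb_pgf(q, mean, dispersion)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q  # Not converged within max_iter (e.g. near the critical mean of 1).
```

For example, with `dispersion = 1` the offspring distribution is geometric and the fixed point has the closed form 1/R for R > 1, so `extinction_probability(2.0, 1.0)` should be close to 0.5.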

Despeckling and Classification of High Resolution SAR Imagery (고해상도 SAR 영상 Speckle 제거 및 분류)

  • Lee, Sang-Hoon
    • Korean Journal of Remote Sensing / v.25 no.5 / pp.455-464 / 2009
  • Lee (2009) proposed a boundary-adaptive despeckling method using a Bayesian model based on the lognormal distribution for image intensity and a Markov random field (MRF) for image texture. This method employs the Point-Jacobian iteration to obtain a maximum a posteriori (MAP) estimate of the despeckled imagery. The boundary-adaptive algorithm is designed to use less information from more distant neighbors as a pixel gets closer to a boundary, which reduces the chance of including pixel values from an adjacent region with different characteristics. The boundary-adaptive scheme was comprehensively evaluated using simulation data, and the effectiveness of boundary adaptation was demonstrated in Lee (2009). This study, as an extension of Lee (2009), suggests a modified iteration algorithm for MAP estimation that improves computational efficiency and incorporates classification. Experiments with simulation data show that boundary adaptation yields clear boundaries and reduces classification error. The boundary-adaptive scheme has also been applied to high-resolution Terra-SAR data acquired from the west coast of Youngjong-do, and the results imply that it can improve analytical accuracy in SAR applications.

How can the post-war reconstruction project be carried out in a stable manner? - terrorism prediction using a Bayesian hierarchical model (전후 재건사업을 안정적으로 진행하려면? - 베이지안 계층모형을 이용한 테러 예측)

  • Eom, Seunghyun;Jang, Woncheol
    • The Korean Journal of Applied Statistics / v.35 no.5 / pp.603-617 / 2022
  • Following the September 11, 2001 terrorist attacks, the United States declared a war on terror and invaded Afghanistan and Iraq, winning quickly. However, a significant amount of time was then spent on the post-war stabilization effort, which failed to minimize the number of terrorist activities that occurred later, and this has driven interest in analyzing terrorist activities. Based on terrorism data from 2003 to 2010, this study used a Bayesian hierarchical model to forecast the terrorist threat in 2011. The model captures spatiotemporal dependence, with predictors such as population and religion by autonomous district. The military commander in charge of a region can use the forecasts from our model to prevent terrorism by deploying forces efficiently.

Socio-economic impacts on human-modified hydrological drought using Copula Bayesian networks : a case study of Chungju Dam basin (Copula Bayesian networks를 활용한 수문학적 가뭄에 대한 사회경제적 인자들의 영향 평가 : 충주댐 유역을 중심으로)

  • Shin, Ji Yae;Son, Ho Jun;Kwon, Hyun-Han;Kim, Tae-Woong
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.343-343 / 2021
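  • Many scientists assess the large-scale droughts that have recently occurred in Korea and abroad not as purely natural droughts but as prolonged droughts in which human-altered watershed conditions have changed evapotranspiration, soil moisture, and streamflow from their natural state. In most regions of Korea, water resources are managed around dams and reservoirs, so conditions differ from hydrological drought driven by runoff from natural hydrological processes. Socio-economic factors (population density, the scale of the agricultural and industrial economy, etc.) strongly affect water use from dams and reservoirs, and the characteristics of hydrological drought that accounts for human water use, as judged from reservoir storage (human-modified hydrological drought), can differ greatly from those of hydrological drought under natural conditions. However, most studies comparing the effects of socio-economic factors on hydrological drought are based on correlation analysis. In this study, to quantitatively compare the degree to which the factors affect human-modified hydrological drought, a Bayesian network model was used to analyze the relationship between socio-economic factors and human-modified hydrological drought. Based on this relationship, the joint probabilities within the Bayesian network were estimated using copula functions. Among the various socio-economic factors, agricultural water use and domestic and industrial water use were compiled as usable factors on the basis of a causal map, a meteorological drought index was additionally considered, and the model was applied to the Chungju Dam basin in the Han River basin. The results show that the probability of human-modified hydrological drought increases as meteorological drought, agricultural water use, and domestic and industrial water use increase. Among the socio-economic factors, domestic and industrial water use (0.39~0.49) has an overall greater influence on human-modified hydrological drought than agricultural water use (0.36~0.48), and the influence of domestic and industrial water use was found to be even greater at smaller values. Accordingly, in responding to human-modified hydrological drought, reducing domestic and industrial water use should take priority over reducing agricultural water use to achieve the greatest effect. Because the model proposed in this study is based on a Bayesian network, it allows further study of the combined effects of two or more factors on drought.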

Identification of Uncertainty on the Reduction of Dead Storage in Soyang Dam Using Bayesian Stochastic Reliability Analysis (Bayesian 추계학적 신뢰도 기법을 이용한 소양강댐 퇴사용량 감소의 불확실성 분석)

  • Lee, Cheol-Eung;Kim, Sang Ug
    • Journal of Korea Water Resources Association / v.46 no.3 / pp.315-326 / 2013
  • Despite the importance of maintaining reservoir storage, relatively few studies have addressed stochastic reliability analysis that includes uncertainty in the decrease of reservoir storage due to sedimentation. Therefore, in this paper a stochastic gamma process under the reliability framework is developed and applied to estimate the reduction of the Soyang Dam reservoir storage. In particular, to estimate the parameters of the stochastic gamma process, a Bayesian MCMC scheme with an informative prior distribution is used to incorporate a wide variety of information related to sedimentation. The results show that the selected informative prior distribution is reasonable, because the uncertainty of the posterior distribution is reduced considerably compared with that of the prior distribution. Also, the expected lifetime of the dead storage in the Soyang Dam reservoir, including uncertainty, is estimated to range from 119.3 years to 183.5 years at the 5% significance level. Finally, it is suggested that the improved assessment strategy in this study can provide valuable information to decision makers in charge of reservoir maintenance.

Theoretical Considerations for the Agresti-Coull Type Confidence Interval in Misclassified Binary Data (오분류된 이진자료에서 Agresti-Coull유형의 신뢰구간에 대한 이론적 고찰)

  • Lee, Seung-Chun
    • Communications for Statistical Applications and Methods / v.18 no.4 / pp.445-455 / 2011
  • Although misclassified binary data occur frequently in practice, the statistical methodology available for such data is rather limited. In particular, interval estimation of the population proportion has relied on the classical Wald method. Recently, Lee and Choi (2009) developed a new confidence interval by applying the Agresti-Coull approach and showed the efficiency of their proposed confidence interval numerically, but a theoretical justification has not yet been explored. Therefore, a Bayesian model for misclassified binary data is developed to consider the Agresti-Coull confidence interval from a theoretical point of view. It is shown that the Agresti-Coull confidence interval is essentially a Bayesian confidence interval.
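
For orientation, the ordinary Agresti-Coull interval (for binary data without misclassification) can be computed as below. This is a hedged sketch of the textbook interval only, not the misclassification-adjusted interval of Lee and Choi (2009) or the Bayesian model of this paper; the function name and the default `z` are assumptions chosen for illustration.

```python
import math


def agresti_coull(x, n, z=1.959963984540054):
    """Textbook Agresti-Coull interval for a binomial proportion.

    x: number of successes, n: number of trials, z: normal quantile
    (the default corresponds to a nominal 95% interval).
    Does NOT account for misclassification.
    """
    n_adj = n + z * z                   # adjusted sample size
    p_adj = (x + z * z / 2.0) / n_adj   # adjusted point estimate
    half = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return max(0.0, p_adj - half), min(1.0, p_adj + half)
```

With `z = 2` the centre of the interval is (x + 2)/(n + 4), which coincides with the posterior mean of the proportion under a Beta(2, 2) prior. This gives some intuition for why a Bayesian reading of the interval is natural.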