• Title/Summary/Keyword: Markov chain 1


Bayesian logit models with auxiliary mixture sampling for analyzing diabetes diagnosis data (보조 혼합 샘플링을 이용한 베이지안 로지스틱 회귀모형 : 당뇨병 자료에 적용 및 분류에서의 성능 비교)

  • Rhee, Eun Hee;Hwang, Beom Seuk
    • The Korean Journal of Applied Statistics / v.35 no.1 / pp.131-146 / 2022
  • Logit models are commonly used to predict and classify categorical response variables. Most Bayesian approaches to logit models are implemented with the Metropolis-Hastings algorithm. However, that algorithm converges slowly, and it is difficult to choose an adequate proposal distribution. We therefore use the auxiliary mixture sampler proposed by Frühwirth-Schnatter and Frühwirth (2007) to estimate logit models. This method introduces two sequences of auxiliary latent variables so that the logit model satisfies normality and linearity. As a result, the logit model can be implemented easily by Gibbs sampling. We applied the proposed method to diabetes data from the Community Health Survey (2020) of the Korea Disease Control and Prevention Agency and compared its performance with that of the Metropolis-Hastings algorithm. In addition, we showed that the logit model using auxiliary mixture sampling achieves classification performance comparable to that of machine learning models.
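
The Metropolis-Hastings baseline mentioned in the abstract above can be illustrated with a minimal random-walk sketch. This is not the authors' implementation and not the auxiliary mixture sampler; the function names, the independent N(0, 10^2) prior, and the step size are assumptions chosen only for illustration.

```python
import math
import random

def log_posterior(beta, X, y, prior_sd=10.0):
    """Log posterior of a logit model under an assumed N(0, prior_sd^2) prior."""
    lp = -sum(b * b for b in beta) / (2.0 * prior_sd ** 2)
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        # Bernoulli log-likelihood y*eta - log(1 + exp(eta)), computed stably.
        lp += yi * eta - (max(eta, 0.0) + math.log1p(math.exp(-abs(eta))))
    return lp

def metropolis_hastings(X, y, n_iter=5000, step=0.3, seed=0):
    """Random-walk Metropolis-Hastings; returns the list of coefficient draws."""
    rng = random.Random(seed)
    beta = [0.0] * len(X[0])
    current = log_posterior(beta, X, y)
    draws = []
    for _ in range(n_iter):
        # Symmetric Gaussian proposal centred on the current state.
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        cand = log_posterior(proposal, X, y)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, cand - current)):
            beta, current = proposal, cand
        draws.append(beta)
    return draws
```

The step size has to be tuned by hand here, which is the proposal-adequacy difficulty the abstract refers to; a Gibbs sampler needs no such tuning.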

A Search for Exoplanets around Northern Circumpolar Stars. IX. A Multi-Period Analysis of the M Giant HD 135438

  • Byeong-Cheol Lee;Jae-Rim Koo;Yeon-Ho Choi;Tae-Yang Bang;Beomdu Lim;Myeong-Gu Park;Gwanghui Jeong
    • Journal of The Korean Astronomical Society / v.56 no.2 / pp.277-286 / 2023
  • It is difficult to distinguish the pure signal produced by a planetary companion orbiting a giant star from other possible sources, such as stellar spots, pulsations, or certain activities. Since 2003, we have obtained radial velocity (RV) data from evolved stars using the high-resolution, fiber-fed Bohyunsan Observatory Echelle Spectrograph (BOES) at the Bohyunsan Optical Astronomy Observatory (BOAO). Here, we report the results of RV variations in the binary star HD 135438. We found two significant periods: 494.98 d with an eccentricity of 0.23 and 8494.1 d with an eccentricity of 0.83. Considering orbital stability, it is impossible to have two companions in such close orbits with high eccentricity. To determine the nature of the RV variability, we analyzed indicators of stellar spots and stellar chromospheric activity and found no signals related to the significant period of 494.98 d. However, we calculated the upper limit of the rotation period from the rotational velocity and found it to be 478-536 d. One possible interpretation is that this period is closely related to rotational modulation at an orbital inclination of 67-90 degrees. The other signal, corresponding to the period of 8494.1 d, is probably associated with a stellar companion orbiting the giant star. A Markov Chain Monte Carlo (MCMC) simulation considering a single companion indicates that the HD 135438 system hosts a stellar companion of $0.57^{+0.017}_{-0.017}\,M_{\odot}$ with an orbital period of 8498 d.

Simulation of the Phase-Type Distribution Based on the Minimal Laplace Transform (최소 표현 라플라스 변환에 기초한 단계형 확률변수의 시뮬레이션에 관한 연구)

  • Sunkyo Kim
    • Journal of the Korea Society for Simulation / v.33 no.1 / pp.19-26 / 2024
  • The phase-type (PH) distribution is defined as the time to absorption into a terminal state in a continuous-time Markov chain. Because the PH distribution includes the family of exponential distributions, it has been widely used in stochastic models. The PH distribution is represented and generated by an initial probability vector and a generator matrix, together called the Markovian representation, so simulating a PH distribution requires finding a vector and a matrix that are consistent with a given set of moments. In this paper, we propose an approach to simulating a PH distribution based on the distribution function, which can be obtained directly from the moments. For the simulation of PH distributions of order 2, closed-form formulas and streamlined procedures are given based on the Jordan decomposition and the minimal Laplace transform, which is computationally more efficient than moment matching methods for the Markovian representation. Our approach can be used more effectively than the Markovian representation to generate higher-order PH distributions in queueing network simulation.
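
The definition in the abstract above (time to absorption in a continuous-time Markov chain) can be sketched directly. This shows the conventional Markovian-representation simulation that the paper compares against, not its Laplace-transform method; `sample_ph`, `alpha`, and `T` are illustrative names, and `alpha` is assumed to put all mass on the transient states.

```python
import random

def sample_ph(alpha, T, rng):
    """Draw one phase-type variate as the absorption time of a CTMC.

    alpha: initial probabilities over the transient states.
    T: sub-generator matrix; T[i][j] is the rate from transient state i to j,
       and T[i][i] is minus the total exit rate of state i.
    """
    n = len(alpha)
    state = rng.choices(range(n), weights=alpha)[0]
    t = 0.0
    while True:
        rate = -T[state][state]          # total exit rate of the current state
        t += rng.expovariate(rate)       # exponential holding time
        # Jump weights: the other transient states, then absorption as index n.
        off_diag = [T[state][j] if j != state else 0.0 for j in range(n)]
        absorb = max(rate - sum(off_diag), 0.0)
        nxt = rng.choices(range(n + 1), weights=off_diag + [absorb])[0]
        if nxt == n:
            return t
        state = nxt
```

For example, `sample_ph([1.0, 0.0], [[-2.0, 2.0], [0.0, -2.0]], random.Random(0))` draws from an Erlang distribution with two phases of rate 2, whose mean is 1.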

A Study on derivation of drought severity-duration-frequency curve through a non-stationary frequency analysis (비정상성 가뭄빈도 해석 기법에 따른 가뭄 심도-지속기간-재현기간 곡선 유도에 관한 연구)

  • Jeong, Minsu;Park, Seo-Yeon;Jang, Ho-Won;Lee, Joo-Heon
    • Journal of Korea Water Resources Association / v.53 no.2 / pp.107-119 / 2020
  • This study analyzed past drought characteristics based on observed rainfall data and produced a long-term outlook for future extreme droughts using Representative Concentration Pathways 8.5 (RCP 8.5) climate change scenarios. The Standardized Precipitation Index (SPI), a meteorological drought index, was applied with durations of 1, 3, 6, 9 and 12 months for quantitative drought analysis. A single long-term time series was constructed by combining daily rainfall observation data and the RCP scenario. The constructed data were used as SPI input for each duration. To analyze meteorological droughts observed over the relatively long period since 1954 in Korea, 12 rainfall stations were selected, and 10 general circulation models (GCMs) were applied at the same points. Trend analysis and clustering were performed to analyze drought characteristics under climate change. For the non-stationary frequency analysis using a sampling technique, we adopted DEMC, a Bayesian technique that combines differential evolution ("DE") and Markov chain Monte Carlo ("MCMC"). The non-stationary drought frequency analysis was used to derive Severity-Duration-Frequency (SDF) curves for the 12 locations. A quantitative outlook for future droughts was produced by deriving SDF curves from long-term hydrologic data under the assumption of non-stationarity and by quantitatively identifying potential drought risks. The cluster analysis performed to identify spatial characteristics indicated a high risk of future drought in Jeonju, Gwangju, Yeosun, Mokpo, and Chupyeongryeong, but not Jeju, corresponding to Zones 1-2, 2, and 3-2. These results could be used effectively in future drought management policies.

Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model

  • Jeong, Young-Seob;Jin, Sou-Young;Choi, Ho-Jin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.7 no.1 / pp.81-98 / 2013
  • Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Because the likelihood of these models is intractable, training any topic model requires an approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approximation algorithms perform well, training a topic model is still computationally expensive given the large amount of data it requires. In this paper, we propose a new method, called non-simultaneous sampling deactivation, for efficient approximation of the parameters of a topic model. In traditional approximation algorithms, every random variable is sampled or obtained after a single predefined burn-in period, whereas our method is based on the observation that the random variable nodes in a topic model all have different periods of convergence. During the iterative approximation process, the proposed method allows each random variable node to be terminated or deactivated once it has converged. Therefore, compared with traditional approaches, in which every node is usually deactivated concurrently, the proposed method makes inference more efficient in terms of time and memory. We do not propose a new approximation algorithm, but a new process applicable to the existing approximation algorithms. Through experiments, we show the time and memory efficiency of the method and discuss the tradeoff between the efficiency of the approximation process and parameter consistency.

The Impact of Anthropogenic Land Cover Change on Degradation of Grade in Ecology and Nature Map (생태자연도 등급 하락에 영향을 미치는 인위적 토지피복 변화 분석)

  • Choi, Chul-Hyun;Lim, Chi-Hong;Lee, Sung-Je;Seo, Hyun-Jin
    • Journal of the Korean Society of Environmental Restoration Technology / v.22 no.6 / pp.77-87 / 2019
  • The first grade zones in the Ecology and Nature Map are important regions for ecosystem conservation, but they can be degraded by various anthropogenic factors. This study analyzes the relationship between potential land cover change and degradation of the first grade zones using land cover transition probability. The results showed that most of the degraded first grade zones were converted from forest to urban (5.1%), cropland (27.2%), barren (11.0%) and grass (27.5%) in Gangwon, and from forest to urban (18.0%), cropland (15.3%), grass (28.4%) and barren (12.3%) in Gyeonggi. The logistic regression analysis showed that the probability of degradation of a first grade zone was higher in areas with a higher expected probability of transition to urban, cropland, barren, or grass cover. The barren transition probability was the most influential, followed by the grass transition probability. There were regional differences in the probabilities of urban and cropland transition, and the urban transition probability was more influential in Gyeonggi-do. This is because development pressure, such as housing site development, is high in Gyeonggi-do. Due to the limitations of the Act on Mountain Districts Management, the grade may be degraded even in the first grade zones. Therefore, if the Ecology and Nature Map is used to prevent deforestation or conversion of mountainous districts, it may contribute to the preservation of the ecosystem.

Secure MAP Discovery Schemes in Hierarchical MIPv6 (계층적 Mobile IPv6에서의 안전한 MAP 검색 기법)

  • Choi, Jong-Hyoun;Mun, Young-Song
    • Journal of KIISE:Information Networking / v.34 no.1 / pp.41-47 / 2007
  • Hierarchical Mobile IPv6 (HMIPv6) has been proposed to accommodate frequent mobility of the Mobile Node and to reduce the signaling load. A Mobility Anchor Point (MAP) is a router located in a network visited by the Mobile Node. The Mobile Node uses the Mobility Anchor Point as a local Home Agent. The absence of any protection between the Mobile Node and the Mobility Anchor Point may allow malicious Mobile Nodes to impersonate other legitimate ones or to impersonate a Mobility Anchor Point. In this paper, we propose a mechanism for secure Mobility Anchor Point discovery in HMIPv6. The performance analysis and the numerical results presented in this paper show that our proposal outperforms other methods.

Analysis of Periodicity of Meteorological Measures and Their Effects on Precipitation Observed with Surface Meteorological Instruments at Eight Southwestern Areas, Korea during 2004KOEP (기상인자의 주기성 분석 및 일반화 선형모형을 이용한 강수영향분석: 2004KEOP의 한반도 남서지방 8개 지역 기상관측자료사용)

  • Kim Hea-Jung;Yum Joonkeun;Lee Yung-Seop;Kim Young-Ah;Chung Hyo-Sang;Cho Chun-Ho
    • The Korean Journal of Applied Statistics / v.18 no.2 / pp.281-296 / 2005
  • This article summarizes our research on the estimation of area-specific and time-adjusted rainfall rates during 2004KEOP (Korea enhanced observation period: June 1, 2004 to August 31, 2004). The rainfall rate is defined as the proportion of rainfall days per week, and the areas are Haenam, Yeosu, Janghung, Heuksando, Gwangju, Mokpo, Jindo, and Wando. Our objectives are to analyze periodicity in area-specific precipitation and the meteorological measures, and to investigate the relationships between the geographic pattern of the rainfall rates and the corresponding pattern in potential explanatory covariates such as temperature, wind, wind direction, pressure, and humidity. A generalized linear model is introduced to meet these objectives, and the patterns are estimated from a set of rainfall rates produced using samples from the posterior distribution of the population rainfall rates.

Survival Analysis of Gastric Cancer Patients with Incomplete Data

  • Moghimbeigi, Abbas;Tapak, Lily;Roshanaei, Ghodaratolla;Mahjub, Hossein
    • Journal of Gastric Cancer / v.14 no.4 / pp.259-265 / 2014
  • Purpose: Survival analysis of gastric cancer patients requires knowledge about factors that affect survival time. This paper attempted to analyze the survival of patients with incomplete registered data by using imputation methods. Materials and Methods: Three missing data imputation methods, including regression, the expectation maximization algorithm, and multiple imputation (MI) using Monte Carlo Markov Chain methods, were applied to the data of cancer patients referred to the cancer institute at Imam Khomeini Hospital in Tehran from 2003 to 2008. The data included demographic variables, survival times, and the censoring variable for 471 patients with gastric cancer. After using imputation methods to account for missing covariate data, the data were analyzed using a Cox regression model and the results were compared. Results: The mean patient survival time after diagnosis was $49.1{\pm}4.4$ months. In the complete case analysis, which used information from 100 of the 471 patients, very wide and uninformative confidence intervals were obtained for the chemotherapy and surgery hazard ratios (HRs). However, after imputation, the maximum confidence interval widths for the chemotherapy and surgery HRs were 8.470 and 0.806, respectively. The minimum width corresponded to MI. Furthermore, the minimum Bayesian and Akaike information criteria values were obtained with MI (-821.236 and -827.866, respectively). Conclusions: Missing value imputation improved the precision and accuracy of the estimates. In addition, MI yielded better results than the expectation maximization algorithm and simple regression imputation methods.

A New Bootstrap Simulation Method for Intermittent Demand Forecasting (간헐적 수요예측을 위한 부트스트랩 시뮬레이션 방법론 개발)

  • Park, Jinsoo;Kim, Yun Bae;Lee, Ha Neul;Jung, Gisun
    • Journal of the Korea Society for Simulation / v.23 no.3 / pp.19-25 / 2014
  • Demand forecasting is the basis of management activities, including marketing strategy. In particular, the demand for parts is very important in supply chain management (SCM). In many industries, part demand is usually intermittent, meaning that zero demands occur frequently. In intermittent demand, the non-zero demands have large variance and their occurrences are also stochastic. Accordingly, it is inappropriate to apply traditional time series models or cause-effect methods such as linear regression to intermittent demand forecasting, because they cannot describe the behavior of intermittent demand. The Markov bootstrap method was developed to forecast intermittent demand. It assumes first-order autocorrelation and independence of lead time demands. To relax the assumption of independent lead time demands, this paper proposes a modified bootstrap method. The method produces pseudo data that approximately share the characteristics of the historical data. A numerical example with real data is provided as a case study.
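
The baseline Markov bootstrap described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the modified method the paper proposes: a two-state (zero / non-zero) first-order chain is estimated from the history, and non-zero sizes are resampled with replacement. The name `markov_bootstrap` and its parameters are hypothetical.

```python
import random

def markov_bootstrap(history, lead_time, n_paths=1000, seed=0):
    """Simulate lead-time demand totals from an intermittent demand history."""
    rng = random.Random(seed)
    occ = [1 if d > 0 else 0 for d in history]
    # Count first-order transitions between zero (0) and non-zero (1) periods.
    counts = {0: [0, 0], 1: [0, 0]}
    for a, b in zip(occ, occ[1:]):
        counts[a][b] += 1
    p_nonzero = {}
    for s in (0, 1):
        n_s = sum(counts[s])
        # Fall back to the overall non-zero share if a state was never left.
        p_nonzero[s] = counts[s][1] / n_s if n_s else sum(occ) / len(occ)
    sizes = [d for d in history if d > 0]
    totals = []
    for _ in range(n_paths):
        state, total = occ[-1], 0
        for _ in range(lead_time):
            # Step the occurrence chain, then resample a size if demand occurs.
            state = 1 if rng.random() < p_nonzero[state] else 0
            if state:
                total += rng.choice(sizes)
        totals.append(total)
    return totals
```

The empirical distribution of the returned totals can then be used to read off quantiles of lead-time demand, for instance by sorting the list and taking the value at the desired percentile.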