• Title/Summary/Keyword: Models, statistical

Latent class model for mixed variables with applications to text data (혼합모드 잠재범주모형을 통한 텍스트 자료의 분석)

  • Shin, Hyun Soo;Seo, Byungtae
    • The Korean Journal of Applied Statistics / v.32 no.6 / pp.837-849 / 2019
  • Latent class models (LCM) are useful tools for drawing hidden information from categorical data. An LCM can also be interpreted as a mixture model with multinomial component distributions. In some cases, however, an available dataset may contain both categorical and count or continuous data. For such cases, we can extend the LCM to a mixture model with both multinomial and other component distributions, such as normal and Poisson distributions. In this paper, we consider an LCM for data containing both categorical and count variables and use it to analyze the Drug Review dataset, which contains categorical responses and text reviews. From this data analysis, we show that we can obtain more specific hidden information than from an LCM fitted only to the categorical responses.
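The mixture interpretation above can be illustrated with a minimal sketch of the class-membership (E-step) calculation for one observation that has a single categorical item and a single count. This is not the authors' implementation: the two-class parameter values, the function names, and the assumption that the categorical and count variables are independent within a class are all illustrative choices.

```python
import math


def poisson_pmf(k: int, rate: float) -> float:
    """Poisson probability of observing count k given the rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def class_posteriors(category, count, weights, cat_probs, rates):
    """Posterior class probabilities for one observation.

    Each class contributes: class weight x multinomial (categorical)
    probability x Poisson probability, assuming the two variables are
    independent within a class.
    """
    joint = [
        w * probs[category] * poisson_pmf(count, rate)
        for w, probs, rate in zip(weights, cat_probs, rates)
    ]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical two-class parameters (not taken from the paper).
weights = [0.6, 0.4]                            # class proportions
cat_probs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]  # P(category | class)
rates = [1.0, 5.0]                              # Poisson rate per class

post = class_posteriors(category=2, count=6, weights=weights,
                        cat_probs=cat_probs, rates=rates)
print(post)
```

In a full EM fit, these posteriors would be computed for every observation and then used to update the class weights, category probabilities, and Poisson rates.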

Effect of an unsampled population on the estimation of a population size (집단 크기 추정에 대한 미표본 집단의 영향)

  • Chung, Yujin
    • The Korean Journal of Applied Statistics / v.33 no.3 / pp.347-355 / 2020
  • An Isolation-with-Migration (IM) model is used to estimate extant population sizes, the splitting times at which populations diverged from their common ancestral populations, and migration rates between the extant populations. An evolutionary model such as an IM model is estimated by analyzing DNA sequences sampled from the extant populations in the model. When the true model includes an unsampled 'ghost' population without data, the unsampled population is often omitted from the evolutionary model to be inferred. In this paper, we conduct a simulation study to investigate the effect of an unsampled population on the estimation of the size of the sampled population. When an unsampled population exchanged migrants with the sampled population, the size estimate of the sampled population was biased. However, the size estimate improved when an evolutionary model that included the unsampled population was estimated.

Modelling Missing Traffic Volume Data using Circular Probability Distribution (순환확률분포를 이용한 교통량 결측자료 보정 모형)

  • Kim, Hyeon-Seok;Im, Gang-Won;Lee, Yeong-In;Nam, Du-Hui
    • Journal of Korean Society of Transportation / v.25 no.4 / pp.109-121 / 2007
  • In this study, an imputation model using a circular probability distribution was developed to overcome the problem of missing data in traffic surveys. Existing ad-hoc or heuristic, model-based, and algorithm-based imputation techniques were reviewed through previous studies, and their limitations for imputing missing traffic volume data were identified. The statistical computing language 'R' was employed for model construction, and a mixture of von Mises distributions, which are classified as symmetric, and a unimodal circular probability distribution were finally fitted to traffic volume data from survey stations in urban and rural areas, respectively. The circular probability distribution model largely outperformed a dummy variable regression model under various evaluation conditions. Circular probability distribution models were found to depict the circularity of hourly volumes well and to be very cost-effective and robust to changes in the missing-data mechanism.
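As a rough illustration of how a von Mises mixture can describe hourly volumes on a 24-hour circle, the sketch below evaluates a two-component mixture density. The study itself used R; this Python version is only a sketch, and the peak hours, concentration values, weights, and function names are hypothetical. The modified Bessel function I0 is computed from its power series so that the example needs only the standard library.

```python
import math


def bessel_i0(kappa: float, terms: int = 60) -> float:
    """Modified Bessel function I0 via its power series."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        # Ratio of consecutive series terms: (kappa/2)^2 / (m+1)^2
        term *= (kappa / 2.0) ** 2 / ((m + 1) ** 2)
    return total


def von_mises_pdf(theta: float, mu: float, kappa: float) -> float:
    """von Mises density with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_pdf(theta, weights, mus, kappas):
    """Weighted sum of von Mises component densities."""
    return sum(w * von_mises_pdf(theta, m, k)
               for w, m, k in zip(weights, mus, kappas))


def hour_to_angle(hour: float) -> float:
    """Map an hour of day (0-24) to an angle in radians."""
    return 2.0 * math.pi * hour / 24.0


# Hypothetical parameters: morning and evening peaks at 08:00 and 18:00.
weights = [0.5, 0.5]
mus = [hour_to_angle(8), hour_to_angle(18)]
kappas = [4.0, 3.0]

# Numerical check that the mixture integrates to about 1 over the circle.
grid = 2000
step = 2.0 * math.pi / grid
total_mass = sum(mixture_pdf(i * step, weights, mus, kappas)
                 for i in range(grid)) * step
print(total_mass)
```

Under this kind of model, the share of daily volume expected in a missing hour could be read from the fitted density, which is one way a circular distribution can support imputation.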

Assessment of the Glycophorin A Mutant Assay as a Biologic Marker for Low Dose Radiation Exposure (저선량 방사선 노출에 대한 생물학적 지표로서 Glycophorin A 변이발현율 측정의 유용성 평가)

  • Ha, Mi-Na;Yoo, Keun-Young;Ha, Sung-Whan;Kim, Dong-Hyun;Cho, Soo-Hun
    • Journal of Preventive Medicine and Public Health / v.33 no.2 / pp.165-173 / 2000
  • Objectives: To assess the usefulness of the glycophorin A (GPA) assay for detecting the biological effect of ionizing radiation in workers exposed to low doses of radiation. Methods: Information on confounding factors, such as age and cigarette smoking, was obtained from 144 nuclear power plant workers and 32 hospital workers by a self-administered questionnaire. Information on physical exposure levels was obtained from the registries of radiation exposure monitoring and control at each facility. The GPA mutant assay was performed using a modified BR6 method with a FACScan flow cytometer. Results: As confounders, age and cigarette smoking habits showed increasing trends with GPA variant frequency, but these were not statistically significant. Hospital workers showed a higher frequency of the GPA variant than nuclear power plant workers in terms of the NO variant. Significant dose-response relationships were obtained from simple and multiple linear regression models. The slope of the regression equation for nuclear power plant workers was much smaller than that for hospital workers. These findings suggest that there may be apparent dose-rate effects. Conclusion: In populations exposed to chronic low-dose radiation, the GPA assay has the potential to be used as an effective biologic marker for assessing the cumulative bone marrow exposure dose.

Relationship Between Non-alcoholic Fatty Liver Disease and Decreased Bone Mineral Density: A Retrospective Cohort Study in Korea

  • Sung, Jisun;Ryu, Seungho;Song, Yun-Mi;Cheong, Hae-Kwan
    • Journal of Preventive Medicine and Public Health / v.53 no.5 / pp.342-352 / 2020
  • Objectives: The aim of this retrospective cohort study was to investigate whether non-alcoholic fatty liver disease (NAFLD) was associated with incident bone mineral density (BMD) decrease. Methods: This study included 4536 subjects with normal BMD at baseline. NAFLD was defined as the presence of fatty liver on abdominal ultrasonography without significant alcohol consumption or other causes. Decreased BMD was defined as a diagnosis of osteopenia, osteoporosis, or BMD below the expected range for the patient's age based on dual-energy X-ray absorptiometry. Cox proportional hazards models were used to estimate the hazard ratio of incident BMD decrease in subjects with or without NAFLD. Subgroup analyses were conducted according to the relevant factors. Results: Across 13 354 person-years of total follow-up, decreased BMD was observed in 606 subjects, corresponding to an incidence of 45.4 cases per 1000 person-years (median follow-up duration, 2.1 years). In the model adjusted for age and sex, the hazard ratio was 0.65 (95% confidence interval, 0.51 to 0.82), and statistical significance disappeared after adjustment for body mass index (BMI) and cardiometabolic factors. In the subgroup analyses, NAFLD was associated with a lower risk of incident BMD decrease in females even after adjustment for confounders. The direction of the effect of NAFLD on the risk of BMD decrease changed depending on BMI category and body fat percentage, although the impact was statistically insignificant. Conclusions: NAFLD had a significant protective effect on BMD in females. However, the effects may vary depending on BMI category or body fat percentage.
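The incidence figure quoted in the Results can be reproduced from the reported case count and follow-up time. The short check below uses only the numbers given in the abstract; the function name is an illustrative choice.

```python
def incidence_per_1000(cases: int, person_years: float) -> float:
    """Incidence rate expressed as cases per 1000 person-years."""
    return 1000.0 * cases / person_years


# 606 subjects with decreased BMD over 13 354 person-years of follow-up.
rate = incidence_per_1000(606, 13354)
print(round(rate, 1))
```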

A Numerical Study on CUSUM Test for Volatility Shifts Against Long-Range Dependence (변동성 변화와 장기억성을 구분하는 CUSUM 검정통계량에 대한 실증분석)

  • Lee, Youngsun;Lee, Taewook
    • The Korean Journal of Applied Statistics / v.27 no.2 / pp.291-305 / 2014
  • Persistence is one of the typical characteristics of volatility in financial time series. According to recent research, volatility persistence may be due to either volatility shifts or long-range dependence. In this paper, we consider residual-based CUSUM tests to distinguish between long-range dependence and volatility shifts as sources of volatility persistence in GARCH models. We observe that this test procedure achieves reasonable power without size distortion. Moreover, we employ the AIC and BIC to estimate the change points and the number of change points in volatility. We demonstrate the superiority of residual-based CUSUM tests through various Monte Carlo simulations and an empirical data analysis.
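For orientation, a common form of a CUSUM statistic on squared residuals can be sketched as follows. The abstract does not give the exact statistic used in the paper, so this is an assumed textbook-style version: it compares partial sums of squared residuals with their expected share of the total and scales by the standard deviation of the squared residuals. The toy residuals are hypothetical, and reporting the maximizing index is a simple location guess rather than the AIC/BIC procedure the authors describe.

```python
import math


def cusum_variance_stat(residuals):
    """CUSUM-type statistic on squared residuals.

    Returns the scaled maximum deviation and the index k at which it
    occurs (number of observations before the suspected shift).
    """
    squares = [e * e for e in residuals]
    n = len(squares)
    total = sum(squares)
    mean_sq = total / n
    # Standard deviation of the squared residuals (divisor n).
    tau = math.sqrt(sum((s - mean_sq) ** 2 for s in squares) / n)
    if tau == 0.0:
        return 0.0, 0  # constant squares: no evidence of a shift
    best, best_k, running = 0.0, 0, 0.0
    for k, s in enumerate(squares, start=1):
        running += s
        deviation = abs(running - k * total / n)
        if deviation > best:
            best, best_k = deviation, k
    return best / (math.sqrt(n) * tau), best_k


# Toy residuals: the scale jumps from 1 to 3 halfway through the series.
residuals = [1.0, -1.0] * 25 + [3.0, -3.0] * 25
stat, change_point = cusum_variance_stat(residuals)
print(stat, change_point)
```

In a residual-based test, the input would be the standardized residuals from a fitted GARCH model rather than raw returns.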

A generalized likelihood ratio chart for monitoring type I right-censored Weibull lifetimes (제1형 우측중도절단된 와이블 수명자료를 모니터링하는 GLR 관리도)

  • Han, Sung Won;Lee, Jaeheon
    • The Korean Journal of Applied Statistics / v.30 no.5 / pp.647-663 / 2017
  • The Weibull distribution is popular for modeling lifetimes because it adequately reflects the characteristics of failure and can model either increasing or decreasing failure rates in a simple way. The standard approach in lifetime testing is to wait until all samples have failed; however, censoring can occur owing to practical limitations. In this paper, we propose a generalized likelihood ratio (GLR) chart to monitor changes in the scale parameter for type I right-censored Weibull lifetime data. We also compare the performance of the proposed GLR chart with two previously proposed CUSUM charts using the average run length (ARL). Simulation results show that the Weibull GLR chart is effective at detecting a wide range of shift sizes when the shape parameter and sample size are large and the censoring rate is not too high.
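The likelihood that underlies a chart of this kind can be sketched for type I right-censored data. This is not the GLR chart itself, only the building block: failures contribute the Weibull log-density, censored units contribute the log-survival probability at the censoring time, and for a known shape parameter the scale has a closed-form maximum likelihood estimate. Function names and the toy lifetimes are hypothetical.

```python
import math


def weibull_censored_loglik(observations, shape, scale):
    """Log-likelihood for (time, failed) pairs under a Weibull model."""
    loglik = 0.0
    for time, failed in observations:
        z = (time / scale) ** shape
        if failed:
            # Log-density of an observed failure time.
            loglik += (math.log(shape) - math.log(scale)
                       + (shape - 1.0) * (math.log(time) - math.log(scale)) - z)
        else:
            # Log-survival probability at the censoring time.
            loglik -= z
    return loglik


def scale_mle(observations, shape):
    """Closed-form scale MLE when the shape parameter is known."""
    failures = sum(1 for _, failed in observations if failed)
    if failures == 0:
        raise ValueError("at least one observed failure is required")
    power_sum = sum(time ** shape for time, _ in observations)
    return (power_sum / failures) ** (1.0 / shape)


def censor(lifetimes, censor_time):
    """Apply type I right-censoring at a fixed censoring time."""
    return [(min(t, censor_time), t <= censor_time) for t in lifetimes]


# Toy data: the third unit is still running when the test stops at t = 3.
obs = censor([1.0, 2.0, 5.0], censor_time=3.0)
scale_hat = scale_mle(obs, shape=1.0)
print(scale_hat)
```

A likelihood ratio statistic for a scale shift would compare this log-likelihood at the in-control scale with its value at the estimated scale.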

Estimation of smooth monotone frontier function under stochastic frontier model (확률프런티어 모형하에서 단조증가하는 매끄러운 프런티어 함수 추정)

  • Yoon, Danbi;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.30 no.5 / pp.665-679 / 2017
  • When measuring productive efficiency, it is often necessary to know the production frontier function, which gives the maximum possible output of production units as a function of inputs. Canonical parametric forms of the frontier function were initially considered under the framework of the stochastic frontier model; however, several nonparametric methods have been developed over the last decade. Efforts have recently been made to impose shape constraints such as monotonicity and concavity on the nonparametric estimation of the frontier function; however, most existing methods in that direction suffer from unnecessary non-smooth points in the estimated frontier function. In this paper, we propose methods to estimate a smooth, monotone frontier function for stochastic frontier models and investigate the effect of imposing a monotonicity constraint on the estimation of the frontier function and the finite-dimensional parameters of the model. Simulation studies suggest that imposing the constraint improves the estimation of the frontier function, especially when the sample size is small or moderate. However, no apparent gain was observed in the estimation of the parameters of the error distribution, regardless of sample size.

A Prediction Model on Freeway Accident Duration using AFT Survival Analysis (AFT 생존분석 기법을 이용한 고속도로 교통사고 지속시간 예측모형)

  • Jeong, Yeon-Sik;Song, Sang-Gyu;Choe, Gi-Ju
    • Journal of Korean Society of Transportation / v.25 no.5 / pp.135-148 / 2007
  • Understanding the relation between the characteristics of an accident and its duration is crucial for responding to accidents efficiently and reducing the total delay they cause. The objective of this study is therefore to model accident duration using an accelerated failure time (AFT) metric model. Log-logistic and log-normal AFT models were selected as candidates based on previous studies and statistical theory, and the log-logistic model provided the better fit. Since AFT models are commonly used for prediction, the estimated model can also be used to predict accident duration on freeways as soon as the basic accident information is reported. The predicted information will be directly useful for decisions about the resources needed to clear an accident and to dispatch crews, and should lead to less traffic congestion and faster aid to the injured.

Outbound Air Travel Demand Forecasting Model with Unobserved Regional Characteristics (미관찰 지역 특성을 고려한 내국인 국제선 항공수요 추정 모형)

  • YU, Jeong Whon;CHOI, Jung Yoon
    • Journal of Korean Society of Transportation / v.36 no.2 / pp.141-154 / 2018
  • To meet the ever-increasing demand for international air travel, several plans are underway to open new airports and expand existing provincial airports. However, existing air demand forecasts have been based on the total air demand in Korea or the air demand among major cities, and few forecasts of regional air demand take local characteristics into account. In this study, the outbound air travel demand in the southeastern region of Korea was analyzed, and a fixed-effects model using panel data was proposed as an optimal model that can reflect the inherent characteristics of metropolitan areas, which are difficult to observe in practice. The results of model validation show that panel data analysis effectively addresses the spurious regression and unobserved heterogeneity that are difficult to handle in a model using only a few macroeconomic indicators with time series characteristics. Various statistical validation and conformance tests suggest that the fixed-effects model proposed in this study is superior to other econometric models in predicting demand for international air travel in the southeastern region.