• Title/Abstract/Keywords: bayesian test

245 search results (processing time: 0.023 s)

Identifying differentially expressed genes using the Polya urn scheme

  • Saraiva, Erlandson Ferreira;Suzuki, Adriano Kamimura;Milan, Luis Aparecido
    • Communications for Statistical Applications and Methods / Vol. 24, No. 6 / pp. 627-640 / 2017
  • A common interest in gene expression data analysis is to identify genes that present significant changes in expression levels among biological experimental conditions. In this paper, we develop a Bayesian approach to make a gene-by-gene comparison in the case of a control and more than one treatment experimental condition. The proposed approach is within a Bayesian framework with a Dirichlet process prior. The comparison procedure is based on a model selection procedure developed using the discreteness of the Dirichlet process and its representation via the Polya urn scheme. The posterior probabilities for the models considered are calculated using a Gibbs sampling algorithm. A numerical simulation study is conducted to compare the performance of the proposed method with the usual methods based on analysis of variance (ANOVA) followed by a Tukey test. The methods are compared in terms of true positive rate and false discovery rate. We find that the proposed method outperforms the methods based on ANOVA followed by a Tukey test. We also apply the methodologies to a publicly available data set on Plasmodium falciparum protein.
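
The Polya urn representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' Gibbs sampler; the function name `polya_urn_sample`, the concentration value `alpha`, and the standard normal base measure are assumptions chosen for illustration. Each new draw is either a fresh value from the base measure or a copy of an earlier draw, so ties between condition means (that is, "no differential expression" between those conditions) occur with positive probability.

```python
import random

def polya_urn_sample(n, alpha, base_draw, rng):
    """Draw n values from the Polya urn (Blackwell-MacQueen) scheme.

    The i-th draw (0-based) is a fresh value from the base measure with
    probability alpha / (alpha + i); otherwise it copies one of the i
    earlier draws, chosen uniformly at random.
    """
    draws = []
    for i in range(n):
        if rng.random() < alpha / (alpha + i):
            draws.append(base_draw(rng))      # new value from the base measure
        else:
            draws.append(rng.choice(draws))   # repeat an earlier value (a tie)
    return draws

rng = random.Random(0)
# Hypothetical setup: one control and three treatment means for a single gene.
means = polya_urn_sample(4, alpha=1.0, base_draw=lambda r: r.gauss(0.0, 1.0), rng=rng)
n_clusters = len(set(means))  # fewer distinct values than conditions means some means are tied
```

Each distinct pattern of ties corresponds to one candidate model, which is why the discreteness of the Dirichlet process can serve as a model selection device.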

Determining the adjusting bias in reactor pressure vessel embrittlement trend curve using Bayesian multilevel modelling

  • Gyeong-Geun Lee;Bong-Sang Lee;Min-Chul Kim;Jong-Min Kim
    • Nuclear Engineering and Technology / Vol. 55, No. 8 / pp. 2844-2853 / 2023
  • A sophisticated Bayesian multilevel model for estimating group bias was developed to improve the utility of the ASTM E900-15 embrittlement trend curve (ETC) to assess the conditions of nuclear power plants (NPPs). For multilevel model development, the Baseline 22 surveillance dataset was classified into groups based on the NPP name, product form, and notch orientation. By including the notch direction in the grouping criteria, the developed model could account for differences in transition temperature shift (TTS) among NPP groups with different notch orientations, which have not been considered in previous ETCs. The parameters of the multilevel model and the biases of the NPP groups were calculated using the Markov chain Monte Carlo method. As the number of data points within a group increased, the group bias approached the mean residual and the credible interval of the mean narrowed, and vice versa. Even when the number of surveillance test data points was less than three, the multilevel model could estimate appropriate biases without overfitting. The model also allowed for a quantitative estimate of the changes in the bias and prediction interval that occurred as a result of adding more surveillance test data. The biases estimated through the multilevel model significantly improved the performance of E900-15.
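
The shrinkage behaviour described above (the group bias approaching the mean residual as data accumulate) can be sketched with a conjugate normal-normal model. This is a simplified stand-in, not the paper's MCMC model: it assumes a known within-group standard deviation `sigma`, a known between-group standard deviation `tau`, and a zero prior mean for the bias. The function name and the residual values are hypothetical.

```python
def group_bias_posterior(residuals, sigma, tau):
    """Posterior mean and sd of a group bias b under
    b ~ Normal(0, tau^2) and residual_i ~ Normal(b, sigma^2)."""
    n = len(residuals)
    precision = n / sigma**2 + 1.0 / tau**2          # posterior precision of the bias
    mean_resid = sum(residuals) / n if n else 0.0
    post_mean = (n / sigma**2) * mean_resid / precision
    post_sd = precision ** -0.5
    return post_mean, post_sd

# Hypothetical residuals: every data point sits 10 units above the trend curve.
few_mean, few_sd = group_bias_posterior([10.0] * 2, sigma=10.0, tau=10.0)
many_mean, many_sd = group_bias_posterior([10.0] * 20, sigma=10.0, tau=10.0)
```

With two points the estimated bias is pulled strongly toward zero; with twenty it sits close to the mean residual of 10 and has a smaller posterior standard deviation.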

Complex Segregation Analysis of Categorical Traits in Farm Animals: Comparison of Linear and Threshold Models

  • Kadarmideen, Haja N.;Ilahi, H.
    • Asian-Australasian Journal of Animal Sciences / Vol. 18, No. 8 / pp. 1088-1097 / 2005
  • The main objectives of this study were to investigate the accuracy, bias and power of linear and threshold model segregation analysis methods for detecting major genes in categorical traits in farm animals. Maximum Likelihood Linear Model (MLLM), Bayesian Linear Model (BALM) and Bayesian Threshold Model (BATM) were applied to simulated data on normal, categorical and binary scales as well as to disease data in pigs. Simulated data on the underlying normally distributed liability (NDL) were used to create categorical and binary data. The MLLM method was applied to data on all scales (normal, categorical and binary), and the BATM method was developed and applied only to binary data. The MLLM analyses underestimated parameters for binary as well as categorical traits compared to normal traits, with the bias being very severe for binary traits. The accuracy of major gene and polygene parameter estimates was also very low for binary data compared with that for categorical data; the latter gave results similar to normal data. When disease incidence (on the binary scale) is close to 50%, segregation analysis is more accurate and less biased than for diseases with rare incidence. NDL data were always better than categorical data. Under the MLLM method, the test statistics for categorical and binary data were consistently and unusually high (while the opposite is expected due to loss of information in categorical data), indicating high false discovery rates of major genes if linear models are applied to categorical traits. With Bayesian segregation analysis, 95% highest probability density regions of major gene variances were checked to see whether they included the value of zero (boundary parameter); by nature of this difference between likelihood and Bayesian approaches, the Bayesian methods are likely to be more reliable for categorical data. The BATM segregation analysis of binary data also showed a significant advantage over MLLM in terms of higher accuracy. Based on the results, threshold models are recommended when the trait distributions are discontinuous. Further, segregation analysis could be used in an initial scan of the data for evidence of major genes before embarking on molecular genome mapping.

Landslide Susceptibility Analysis Using Bayesian Network and Semantic Technology

  • 이상훈
    • 대한공간정보학회지 / Vol. 18, No. 4 / pp. 61-69 / 2010
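  • Because failures of slopes or cut-and-fill land cause serious damage to people and property, landslide susceptibility analysis should be performed in advance to prepare for hazards arising from development or natural disasters. Existing landslide susceptibility analyses have relied on heuristic, statistical, deterministic, or probabilistic methods. However, limited field information reduced the reliability of these analyses, and it was difficult to reflect expert experience and knowledge in existing quantitative analysis models. This study builds an ontology model by extracting expert knowledge of landslide susceptibility analysis and the semantics of the spatial input data, and incorporates it into a Bayesian network to propose a probabilistic landslide model. Generation of the Bayesian network structure, previously done manually by experts, is automated through knowledge inference over the ontology model, and the conditional probability distribution of landslide occurrence is constructed by reflecting expert knowledge as well as field information in the modeling. The results were applied in a GIS to produce a landslide susceptibility map. For validation, the method was applied to the Oseosan area near Hongseong, Chungnam, where the results agreed 86.5% with existing traces of landslide occurrence. This study is expected to enable general users to perform regional-scale landslide susceptibility analysis without expert assistance.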

Combining Geostatistical Indicator Kriging with Bayesian Approach for Supervised Classification

  • Park, No-Wook;Chi, Kwang-Hoon;Moon, Wooil-M.;Kwon, Byung-Doo
    • Korean Society of Remote Sensing (대한원격탐사학회): Conference Proceedings / 2002 Proceedings of International Symposium on Remote Sensing / pp. 382-387 / 2002
  • In this paper, we propose a geostatistical approach incorporated into the Bayesian data fusion technique for supervised classification of multi-sensor remote sensing data. Traditional spectral-based classification cannot account for spatial information and may produce unrealistic classification results. To obtain accurate spatial/contextual information, indicator kriging, which allows one to estimate the probability of occurrence of classes on the basis of surrounding observations, is incorporated into the Bayesian framework. This approach has the merit of incorporating both spectral and spatial information, and it improves the confidence level in the final data fusion task. To illustrate the proposed scheme, supervised classification of a multi-sensor test remote sensing data set was carried out.

Classical and Bayesian methods of estimation for power Lindley distribution with application to waiting time data

  • Sharma, Vikas Kumar;Singh, Sanjay Kumar;Singh, Umesh
    • Communications for Statistical Applications and Methods / Vol. 24, No. 3 / pp. 193-209 / 2017
  • The power Lindley distribution and some of its properties are considered in this article. Maximum likelihood, least squares, maximum product spacings, and Bayes estimators are proposed to estimate all the unknown parameters of the power Lindley distribution. Lindley's approximation and Markov chain Monte Carlo techniques are utilized for Bayesian calculations since the posterior distribution cannot be reduced to a standard distribution. The performances of the proposed estimators are compared based on simulated samples. The waiting times of research articles to be accepted in statistical journals are fitted to the power Lindley distribution and to other competing distributions. The chi-square statistic, Kolmogorov-Smirnov statistic, Akaike information criterion and Bayesian information criterion are used to assess goodness-of-fit. It was found that the power Lindley distribution gives a better fit to the data than the other distributions.
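
As a rough illustration of the model comparison step, the sketch below evaluates the power Lindley log-density and the two information criteria. It assumes the usual two-parameter form of the density, f(x) = αβ²/(β+1) · (1 + x^α) · x^(α−1) · exp(−βx^α) for x > 0, which the abstract itself does not state; the sample values and parameter values are made up and are not the article's waiting-time data or fitted estimates.

```python
import math

def power_lindley_logpdf(x, alpha, beta):
    """Log-density of the power Lindley distribution (assumed form) at x > 0."""
    return (math.log(alpha) + 2.0 * math.log(beta) - math.log(beta + 1.0)
            + math.log1p(x ** alpha) + (alpha - 1.0) * math.log(x)
            - beta * x ** alpha)

def aic_bic(loglik, k, n):
    """AIC = 2k - 2 log L and BIC = k ln(n) - 2 log L for k parameters, n observations."""
    return 2.0 * k - 2.0 * loglik, k * math.log(n) - 2.0 * loglik

# Hypothetical sample and parameter values, for illustration only.
sample = [0.4, 0.9, 1.3, 2.1, 0.7]
loglik = sum(power_lindley_logpdf(x, alpha=1.2, beta=1.0) for x in sample)
aic, bic = aic_bic(loglik, k=2, n=len(sample))
```

Lower AIC or BIC indicates the preferred model when several candidate distributions are fitted to the same data.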

A Comparative Study of the Effects of Gibbs Smoothing Priors in Bayesian Tomographic Reconstruction

  • 이수진
    • Korean Society of Medical and Biological Engineering (대한의용생체공학회): Conference Proceedings / 1997 Spring Conference / pp. 279-282 / 1997
  • Bayesian reconstruction methods for emission computed tomography have been a topic of interest in recent years, partly because they allow for the introduction of prior information into the reconstruction problem. Early formulations incorporated priors that imposed simple spatial smoothness constraints on the underlying object using Gibbs priors in the form of four-nearest or eight-nearest neighbors. While these types of priors, known as "membrane" priors, are useful as stabilizers in otherwise unstable ML-EM reconstructions, more sophisticated prior models are needed to model underlying source distributions more accurately. In this work, we investigate whether the "thin plate" model has advantages over the simple Gibbs smoothing priors mentioned above. To test and compare quantitative performance of the reconstruction algorithms, we use Monte Carlo noise trials and calculate bias and variance images of reconstruction estimates. The conclusion is that the thin plate prior outperforms the membrane prior in terms of bias and variance.
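
The difference between the two priors can be sketched in one dimension, assuming the common convention that a membrane energy penalizes squared first differences and a thin plate energy penalizes squared second differences. The function names and the toy profiles are illustrative; the paper's priors act on 2-D images with four- or eight-neighbour cliques.

```python
def membrane_energy(f):
    """Sum of squared first differences: penalizes any change between neighbours."""
    return sum((b - a) ** 2 for a, b in zip(f, f[1:]))

def thin_plate_energy(f):
    """Sum of squared second differences: penalizes curvature, not slope."""
    return sum((a - 2 * b + c) ** 2 for a, b, c in zip(f, f[1:], f[2:]))

ramp = [0, 1, 2, 3, 4]   # linearly varying profile
step = [0, 0, 0, 1, 1]   # sharp edge

ramp_energies = (membrane_energy(ramp), thin_plate_energy(ramp))
step_energies = (membrane_energy(step), thin_plate_energy(step))
```

Under these definitions a linear ramp costs nothing under the thin plate energy but is penalized by the membrane energy, which is one way to see why a thin plate prior can represent smoothly varying source distributions with less bias.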

Independent Testing in Marshall and Olkin's Bivariate Exponential Model Using Fractional Bayes Factor Under Bivariate Type I Censorship

  • Cho, Kil-Ho;Cho, Jang-Sik;Choi, Seung-Bae
    • Journal of the Korean Data and Information Science Society
    • / Vol. 19, No. 4 / pp. 1391-1396 /
    • 2008
  • In this paper, we consider a two-component system whose lifetimes follow Marshall and Olkin's bivariate exponential model, observed under bivariate type I censoring. We propose a Bayesian test of independence for this model using O'Hagan's fractional Bayes factor based on improper prior distributions. We compute the fractional Bayes factor and the posterior probabilities of the hypotheses, and select the hypothesis with the largest posterior probability. Finally, a numerical example is given to illustrate the Bayesian testing procedure.
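
For orientation, the general form of O'Hagan's fractional Bayes factor can be written as below. The notation (priors π_i, likelihoods L_i, training fraction b, prior model probabilities p_i) is assumed here and is not taken from the article, which derives the specific expressions for the censored bivariate exponential model.

```latex
% Fractional marginal for hypothesis H_i with training fraction 0 < b < 1
q_i(b, x) = \frac{\int \pi_i(\theta_i)\, L_i(\theta_i; x)\, d\theta_i}
                 {\int \pi_i(\theta_i)\, L_i(\theta_i; x)^{b}\, d\theta_i},
\qquad i = 1, 2.

% Fractional Bayes factor of H_2 against H_1
B^{F}_{21}(b) = \frac{q_2(b, x)}{q_1(b, x)}.

% Posterior probability of H_1 given prior probabilities p_1 and p_2 = 1 - p_1
P(H_1 \mid x) = \left[ 1 + \frac{p_2}{p_1}\, B^{F}_{21}(b) \right]^{-1}.
```

Because the same prior appears in the numerator and denominator of each q_i, the arbitrary constant of an improper prior cancels, which is what makes the comparison well defined.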

An objective Bayesian analysis for multiple step stress accelerated life tests

  • Kim, Dal-Ho;Kang, Sang-Gil;Lee, Woo-Dong
    • Journal of the Korean Data and Information Science Society
    • / Vol. 20, No. 3 / pp. 601-614 /
    • 2009
  • This paper derives noninformative priors for the scale parameter of the exponential distribution when the data are collected in multiple step-stress accelerated life tests. We find the objective priors for this model and show that the reference prior satisfies the first-order matching criterion. We also show that no second-order matching prior exists. Some simulation results are given, and we perform a Bayesian analysis with the proposed priors using artificial data.

Nomogram for screening the risk of developing metabolic syndrome using naïve Bayesian classifier

  • Minseok Shin;Jeayoung Lee
    • Communications for Statistical Applications and Methods / Vol. 30, No. 1 / pp. 21-35 / 2023
  • Metabolic syndrome is a serious disease that can eventually lead to various complications, such as stroke and cardiovascular disease. In this study, we aimed to identify the risk factors related to metabolic syndrome for its prevention and recognition, and to propose a nomogram that visualizes and predicts the probability of the incidence of metabolic syndrome. We conducted an analysis using data from the Korea National Health and Nutrition Examination Survey (KNHANES VII) and identified 10 risk factors affecting metabolic syndrome by using the Rao-Scott chi-squared test, which accounts for the characteristics of the complex sample. A naïve Bayesian classifier was used to build a nomogram for metabolic syndrome. We then predicted the incidence of metabolic syndrome using the nomogram. Finally, we verified the nomogram using a receiver operating characteristic curve and a calibration plot.
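
A naïve Bayesian nomogram rests on the fact that, under the conditional independence assumption, each risk factor adds its own log likelihood ratio to the prior log-odds. The sketch below shows only that additive step; the function name, the prior, and the likelihood ratios are hypothetical and are not estimates from the KNHANES data.

```python
import math

def naive_bayes_probability(prior_pos, likelihood_ratios):
    """Posterior probability of the positive class under naive Bayes.

    Each entry of likelihood_ratios is P(x_j | positive) / P(x_j | negative)
    for one observed risk factor; their logs add to the prior log-odds.
    """
    log_odds = math.log(prior_pos / (1.0 - prior_pos))
    log_odds += sum(math.log(lr) for lr in likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical example: even prior odds and two risk factors with ratios 2 and 3.
p = naive_bayes_probability(0.5, [2.0, 3.0])
```

In a nomogram, each additive term is drawn as a points scale for one risk factor, and the total is mapped back to a probability on the final axis.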