• Title/Abstract/Keywords: pseudo-statistic approach

3 search results; processing time 0.019 seconds

Multiple Testing in Genomic Sequences Using Hamming Distance

  • Kang, Moonsu
    • Communications for Statistical Applications and Methods / Vol. 19, No. 6 / pp. 899-904 / 2012
  • High-dimensional categorical data models with small sample sizes have not been used extensively in genomic sequences that involve count (or discrete) or purely qualitative responses. A basic task is to identify differentially expressed genes (or positions) among a number of genes. This requires an appropriate test statistic and a corresponding multiple testing procedure, because a multivariate analysis of variance is not feasible. A family-wise error rate (FWER) is not appropriate for testing thousands of genes simultaneously in a multiple testing procedure. The false discovery rate (FDR) is better than FWER in multiple testing problems. The data from the 2002-2003 SARS epidemic show that a conventional FDR procedure and a proposed test statistic based on a pseudo-marginal approach with Hamming distance perform better.
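
The contrast between FWER and FDR control in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's procedure: it assumes the standard Bonferroni correction as the FWER method and the Benjamini-Hochberg step-up rule as the "conventional FDR procedure", and the function names and p-values are hypothetical. A plain Hamming distance between two equal-length sequences is included only to show the quantity the abstract names; the pseudo-marginal test statistic itself is not reproduced here.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def bonferroni(p_values, alpha=0.05):
    """FWER control (assumed method): reject H_i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, q=0.05):
    """FDR control (assumed method): Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten positions (illustrative, not from the paper).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
n_fwer = sum(bonferroni(p))
n_fdr = sum(benjamini_hochberg(p))
```

With these made-up p-values the FDR rule rejects more hypotheses than the Bonferroni rule, which is the usual reason FDR is preferred when thousands of positions are tested at once.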

Improved Statistical Testing of Two-class Microarrays with a Robust Statistical Approach

  • Oh, Hee-Seok;Jang, Dong-Ik;Oh, Seung-Yoon;Kim, Hee-Bal
    • Interdisciplinary Bio Central / Vol. 2, No. 2 / pp. 4.1-4.6 / 2010
  • The most common type of microarray experiment has a simple design using microarray data obtained from two different groups or conditions. A typical method to identify differentially expressed genes (DEGs) between two conditions is the conventional Student's t-test. The t-test is based on a simple estimate of the population variance for a gene, namely the sample variance of its expression levels. Although the empirical Bayes approach improves on the t-statistic by not giving a high rank to genes only because they have a small sample variance, its basic assumption is the same as that of the ordinary t-test, namely the equality of variances across experimental groups. The t-test and the empirical Bayes approach suffer from low statistical power because they assume normal and unimodal distributions for the microarray data. We propose a method to address these problems that is robust to outliers and skewed data, while maintaining the advantages of the classical t-test and modified t-statistics. The resulting data transformation to fit the normality assumption increases the statistical power for identifying DEGs using these statistics.
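
As a rough illustration of the equal-variance Student's t-statistic discussed in the abstract above, the following Python sketch computes the pooled two-sample statistic for one gene. The function name and the expression values are hypothetical, and the paper's robust transformation is not reproduced; the sketch only shows why a gene with a very small sample variance can receive a large t-statistic.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(x, y):
    """Two-sample Student's t-statistic under the equal-variance assumption."""
    nx, ny = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))


# Hypothetical expression levels for two genes in two conditions.
# Gene A: mean difference of 1 with sample variance 1 in each group.
t_gene_a = pooled_t_statistic([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
# Gene B: mean difference of only 0.05, but a very small sample variance.
t_gene_b = pooled_t_statistic([1.00, 1.01, 0.99], [1.05, 1.06, 1.04])
```

Gene B has the smaller mean difference yet the larger absolute t-statistic, because its sample variance is tiny. This is the ranking behaviour that moderated (empirical Bayes) statistics are designed to dampen.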

The prediction of the critical factor of safety of homogeneous finite slopes subjected to earthquake forces using neural networks and multiple regressions

  • Erzin, Yusuf;Cetin, T.
    • Geomechanics and Engineering / Vol. 6, No. 1 / pp. 1-15 / 2014
  • In this study, artificial neural network (ANN) and multiple regression (MR) models were developed to predict the critical factor of safety ($F_s$) of homogeneous finite slopes subjected to earthquake forces. To achieve this, the values of $F_s$ for 5184 homogeneous finite slopes with different slope, soil, and earthquake parameters were calculated using the Simplified Bishop method, and the minimum (critical) $F_s$ for each case was determined and used in the development of the ANN and MR models. The results obtained from both models were compared with those obtained from the calculations. The ANN model was found to give more reliable predictions than the MR model. Moreover, several performance indices, such as the determination coefficient, variance accounted for, mean absolute error, root mean square error, and scaled percent error, were computed. In addition, receiver operating characteristic (ROC) curves were drawn, and the areas under the curves (AUC) were calculated to assess the prediction capacity of the ANN and MR models. The performance level attained shows that the ANN model can be used for predicting the critical $F_s$ of homogeneous finite slopes subjected to earthquake forces.
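
Several of the performance indices listed in the abstract above can be sketched in Python as follows. The abstract does not give the formulas, so the definitions here are assumptions based on common usage: the determination coefficient as 1 - SS_res/SS_tot, and variance accounted for (VAF) as (1 - var(y - ŷ)/var(y)) × 100. The function name and the sample values are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, pvariance


def performance_indices(y_true, y_pred):
    """Return R^2, VAF (%), MAE, and RMSE under the assumed definitions."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    y_bar = mean(y_true)
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return {
        "r2": 1 - ss_res / ss_tot,
        "vaf": (1 - pvariance(residuals) / pvariance(y_true)) * 100,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": sqrt(ss_res / n),
    }


# Hypothetical computed vs. predicted factors of safety (illustrative only).
fs_computed = [1.0, 2.0, 3.0, 4.0]
fs_predicted = [1.1, 1.9, 3.2, 3.8]
indices = performance_indices(fs_computed, fs_predicted)
```

Under these definitions a model is better when R² and VAF are closer to 1 and 100 respectively, and when MAE and RMSE are closer to 0.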