• Title/Abstract/Keywords: normality assumption

Search results: 85 (processing time: 0.024 s)

Bootstrap-Based Fault Identification Method

  • 강지훈;김성범
    • 품질경영학회지 / Vol. 39, No. 2 / pp. 234-243 / 2011
  • Multivariate control charts are widely used to monitor the performance of a multivariate process over time and keep the process in control. Although existing multivariate control charts provide control limits to monitor the process and detect extraordinary events, identifying the causes of an out-of-control alarm is difficult when the number of process variables is large. Several fault identification methods have been developed to address this issue. However, these methods require the process data to be normally distributed. In the present study, we propose a bootstrap-based $T^2$ decomposition technique that does not require any distributional assumption. A simulation study was conducted to examine the properties of the proposed fault identification method under various scenarios and to compare it with the existing parametric $T^2$ decomposition method. The simulation results showed that the proposed method produced better results than the existing one, especially in nonnormal situations.
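The idea of replacing a distribution-based limit with a resampled one can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the use of unconditional per-variable terms $(x_j - \bar{x}_j)^2 / s_j^2$, and the averaging of bootstrap percentiles are all assumptions made for illustration.

```python
import numpy as np

def hotelling_t2(X, mean, cov_inv):
    """Hotelling T^2 statistic for each row of X."""
    d = X - mean
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)

def bootstrap_limit(stats, alpha=0.01, n_boot=500, seed=0):
    """Average of the bootstrap (1 - alpha) percentiles of in-control statistics."""
    rng = np.random.default_rng(seed)
    n = len(stats)
    limits = [
        np.quantile(rng.choice(stats, size=n, replace=True), 1 - alpha)
        for _ in range(n_boot)
    ]
    return float(np.mean(limits))

def flag_variables(x_new, X_ref, alpha=0.01, n_boot=500, seed=0):
    """Flag variables whose unconditional term exceeds its bootstrap limit.

    Hypothetical helper: each term is (x_j - mean_j)^2 / s_j^2, estimated
    from the in-control reference data X_ref (rows are observations).
    """
    mean = X_ref.mean(axis=0)
    var = X_ref.var(axis=0, ddof=1)
    ref_terms = (X_ref - mean) ** 2 / var      # shape (n, p)
    new_terms = (x_new - mean) ** 2 / var      # shape (p,)
    p = X_ref.shape[1]
    limits = np.array([
        bootstrap_limit(ref_terms[:, j], alpha, n_boot, seed + j)
        for j in range(p)
    ])
    return new_terms > limits
```

In use, `hotelling_t2` with a limit from `bootstrap_limit` would raise the overall alarm, and `flag_variables` would then point to the variables responsible. No normal or F-distribution quantile appears anywhere, which is the point of the sketch.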

Improved Statistical Testing of Two-class Microarrays with a Robust Statistical Approach

  • Oh, Hee-Seok;Jang, Dong-Ik;Oh, Seung-Yoon;Kim, Hee-Bal
    • Interdisciplinary Bio Central / Vol. 2, No. 2 / pp. 4.1-4.6 / 2010
  • The most common type of microarray experiment has a simple design using microarray data obtained from two different groups or conditions. A typical method to identify differentially expressed genes (DEGs) between two conditions is the conventional Student's t-test. The t-test is based on a simple estimate of the population variance for a gene, namely the sample variance of its expression levels. Although the empirical Bayes approach improves on the t-statistic by not giving a high rank to genes only because they have a small sample variance, its basic assumption is the same as that of the ordinary t-test: equality of variances across experimental groups. The t-test and the empirical Bayes approach suffer from low statistical power because they assume normal and unimodal distributions for the microarray data. We propose a method to address these problems that is robust to outliers and skewed data while maintaining the advantages of the classical t-test and modified t-statistics. The resulting data transformation to fit the normality assumption increases the statistical power for identifying DEGs using these statistics.

On the Probability of the Estimate of Variance Components that is Negative in Unbalanced One-Way Random Model

  • 송규문
    • Journal of the Korean Data and Information Science Society / Vol. 4 / pp. 121-130 / 1993
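  • For the unbalanced one-way random model, restricting attention to the AOV estimator and to MINQUE with prior values 0, 1, and ${\infty}$, we derive the theoretical probability that the variance component estimate is negative under the normal distribution, and we obtain the corresponding probability for nonnormal distributions by simulation. There is no significant difference between the theoretical probabilities under the normal distribution and the probabilities computed by simulation. The probability that each estimator is negative decreases as the sample size, the number of levels, and ${\rho}$ increase, and among the estimators considered the AOV estimator has the smallest probability of being negative in most cases.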
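The simulation side of such a study can be sketched for the AOV estimator alone. The sketch assumes normal effects and errors, a total variance of 1, and that ${\rho}$ denotes the intraclass correlation $\sigma_a^2 / (\sigma_a^2 + \sigma_e^2)$; the abstract does not define ${\rho}$, so that reading and the function name are assumptions. The AOV estimate $(\mathrm{MSA} - \mathrm{MSE})/n_0$ is negative exactly when $\mathrm{MSA} < \mathrm{MSE}$, so only the two mean squares are needed.

```python
import numpy as np

def prob_negative_anova(group_sizes, rho, n_sim=5000, seed=0):
    """Monte Carlo estimate of P(AOV variance component estimate < 0).

    Assumes y_ij = a_i + e_ij with a_i ~ N(0, rho) and e_ij ~ N(0, 1 - rho).
    """
    rng = np.random.default_rng(seed)
    sizes = np.asarray(group_sizes)
    a, N = len(sizes), int(sizes.sum())
    labels = np.repeat(np.arange(a), sizes)   # group index of each observation
    negative = 0
    for _ in range(n_sim):
        effects = rng.normal(0.0, np.sqrt(rho), a)
        y = effects[labels] + rng.normal(0.0, np.sqrt(1.0 - rho), N)
        group_means = np.bincount(labels, weights=y) / sizes
        msa = np.sum(sizes * (group_means - y.mean()) ** 2) / (a - 1)
        mse = np.sum((y - group_means[labels]) ** 2) / (N - a)
        negative += int(msa < mse)
    return negative / n_sim
```

Calling the function for several unbalanced designs and values of `rho` gives a table of estimated probabilities that can be compared across sample sizes and numbers of levels. A nonnormal study would swap the two `rng.normal` draws for another sampler.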


Parametric and Non Parametric Measures for Text Similarity

  • 존 믈랴히루;김종남
    • 융합신호처리학회논문지 / Vol. 20, No. 4 / pp. 193-198 / 2019
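  • The flood of genuine and fake information on the Internet has prompted a great deal of research on text analysis. Copying others' work without citation and fabricating unrelated research results have drawn public attention for some time. Various tools have been developed to counter and reduce plagiarism in research. In this study, we measure sentence similarity using parametric and nonparametric methods based on cosine similarity and on the Pearson and Spearman correlations. Cosine similarity and the Pearson correlation yielded the highest similarity coefficients, whereas the Spearman correlation yielded lower ones. We propose using nonparametric methods to measure sentence similarity when normality does not hold, in contrast to parametric methods, which depend on the normality assumption and are subject to bias.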
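The three measures named in the abstract can be sketched on simple term-frequency vectors. The whitespace tokenization and the function names below are assumptions for illustration; the abstract does not say how the sentences were vectorized.

```python
from collections import Counter
import numpy as np

def term_vectors(a, b):
    """Term-frequency vectors of two sentences over their shared vocabulary."""
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    vocab = sorted(set(ta) | set(tb))
    x = np.array([ta[w] for w in vocab], dtype=float)
    y = np.array([tb[w] for w in vocab], dtype=float)
    return x, y

def cosine(x, y):
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

def pearson(x, y):
    # Undefined (nan) when either vector is constant.
    return float(np.corrcoef(x, y)[0, 1])

def average_ranks(v):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = np.argsort(v, kind="mergesort")
    ranks = np.empty(len(v), dtype=float)
    ranks[order] = np.arange(1, len(v) + 1)
    for value in np.unique(v):
        mask = v == value
        ranks[mask] = ranks[mask].mean()
    return ranks

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Cosine similarity and the Pearson correlation work on the raw counts, while the Spearman correlation uses only their ordering, which is what makes it the nonparametric measure of the three.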

Robust Bayesian Inference in Finite Population Sampling under Balanced Loss Function

  • Kim, Eunyoung;Kim, Dal Ho
    • Communications for Statistical Applications and Methods / Vol. 21, No. 3 / pp. 261-274 / 2014
  • In this paper we develop Bayes and empirical Bayes estimators of the finite population mean under the balanced loss function, assuming posterior linearity rather than normality of the superpopulation. We compare the performance of the optimal Bayes estimator with that of the classical sample mean and of the usual Bayes estimator under squared error loss, in terms of posterior expected losses, risks, and Bayes risks, when the underlying distribution is normal as well as when it is binomial or Poisson.

Estimation of Small Area Proportions Based on Logistic Mixed Model

  • Jeong, Kwang-Mo;Son, Jung-Hyun
    • 응용통계연구 / Vol. 22, No. 1 / pp. 153-161 / 2009
  • We consider a logistic model with random effects as the superpopulation for estimating small area proportions. The best linear unbiased predictor under the linear mixed model is popular in small area estimation. We use this type of estimator under the logistic mixed model for the small area proportions, and we also discuss the estimation of its mean squared error. Two estimation methods, the parametric bootstrap and the linear approximation, are compared through a Monte Carlo study with respect to the normality assumption on the random effects distribution and the effect of sample size on the approximation.

The Cumulants of the Non-normal t Distribution

  • Hwang, Hark
    • Journal of the Korean Statistical Society / Vol. 5, No. 2 / pp. 91-100 / 1976
  • The use of the statistic $t = \sqrt{n} (\bar{X}-\mu)/S$, where $\bar{X} = \sum X_i/n$, $\mu = E(X_i)$, $S^2 = \sum(X_i-\bar{X})^2/(n-1)$, in statistical inference usually rests on the assumption that the population is normal. If the population is not normally distributed, the tabulated values of Student's t are no longer valid. The moments of t are obtained as a power series in $1/\sqrt{n}$ whose coefficients are functions of the cumulants of X. The cumulants are obtained from the moments in the usual manner. The first eight cumulants of t are given up to terms of order $1/n^3$. These results extend those of Geary, who gave the first six cumulants of t to order $1/n^2$.
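The effect of nonnormality on t can be checked numerically with a small Monte Carlo sketch. It does not reproduce the paper's series expansion; it only simulates the statistic defined above and reports its sample skewness. The function names and the `sampler(rng, size)` calling convention are assumptions.

```python
import numpy as np

def simulate_t(sampler, mu, n, n_sim=20000, seed=0):
    """Simulate t = sqrt(n) * (Xbar - mu) / S for samples of size n.

    `sampler(rng, size)` must return an array of draws with the given shape.
    """
    rng = np.random.default_rng(seed)
    x = sampler(rng, (n_sim, n))
    xbar = x.mean(axis=1)
    s = x.std(axis=1, ddof=1)   # S uses the divisor n - 1, as in the abstract
    return np.sqrt(n) * (xbar - mu) / s

def sample_skewness(t):
    """Third central moment divided by the 3/2 power of the second."""
    c = t - t.mean()
    return float(np.mean(c ** 3) / np.mean(c ** 2) ** 1.5)
```

For example, `simulate_t(lambda rng, size: rng.exponential(1.0, size), mu=1.0, n=10)` gives a clearly left-skewed t, while a normal sampler gives skewness near zero, which is why tabulated Student's t values mislead for skewed populations.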


Combining cluster analysis and neural networks for the classification problem

  • Kim, Kyungsup;Han, Ingoo
    • 한국경영과학회:학술대회논문집 / 한국경영과학회 1996 Fall Conference Proceedings; Korea University, Seoul; 26 Oct. 1996 / pp. 31-34 / 1996
  • Extensive research has compared the performance of neural networks (NN) with that of various statistical techniques for the classification problem. The empirical results of these comparative studies indicate that neural networks often outperform the traditional statistical techniques. Moreover, there have been efforts to combine various classification methods, especially multivariate discriminant analysis, with neural networks. While these efforts improve performance, they risk violating the strong assumptions of multivariate discriminant analysis: multivariate normality of the independent variables and equality of the variance-covariance matrices across groups. In contrast, cluster analysis, like neural networks, does not depend on these assumptions. We propose a new approach to classification problems that combines cluster analysis with neural networks. The resulting predictions of the composite model are more accurate than those of each individual technique.


Study on the Aerodynamic Performance of Low Reynolds Airfoils using a Regression Analysis

  • 진원진
    • 항공우주시스템공학회지 / Vol. 10, No. 3 / pp. 9-14 / 2016
  • Using a multiple regression analysis, a total of 78 low-Reynolds-number airfoils are examined in this paper to clarify the systematic relationships between the geometrical parameters of the airfoils and experimentally determined aerodynamic coefficients. The results show that the maximum camber has a major effect on the maximum lift and the maximum thickness has a major effect on the stalling angle of attack. The lower-surface flatness of the airfoil is also a crucial geometrical parameter for aerodynamic performance. It is shown here that, in general, applying the regression equations to assess aerodynamic performance is reasonably acceptable, with the exception of the lift-curve slope, which violates the normality assumption.

Monitoring of Gene Regulations Using Average Rank in DNA Microarray: Implementation of R

  • Park, Chang-Soon
    • Journal of the Korean Data and Information Science Society / Vol. 18, No. 4 / pp. 1005-1021 / 2007
  • Traditional procedures for DNA microarray data analysis preprocess and normalize the gene expression data and then analyze the normalized data using statistical tests. The traditional methods have several drawbacks: genuine biological signal may be unintentionally eliminated together with artifacts; the limited number of arrays per gene makes it difficult to apply statistical tests, whether they rely on the normality assumption or on nonparametric methods; and genes are tested independently, without considering the interrelationships among them. A novel method using the average rank in each array is proposed to eliminate these drawbacks. The average rank method monitors differentially regulated genes among genetically different groups, and the genes it selects differ somewhat from those selected by the traditional P-value method. Adding the genes selected by the average rank method to those from the traditional method will provide a better understanding of the genetic differences between groups.
