• Title/Summary/Keyword: non-normality


A Test of Multivariate Normality Oriented for Testing Elliptical Symmetry

  • Park, Cheol-Yong
    • Journal of the Korean Data and Information Science Society / v.17 no.1 / pp.221-231 / 2006
  • A chi-squared test of multivariate normality is suggested which is oriented for detecting deviations from elliptical symmetry. We derive the limiting distribution of the test statistic via a central limit theorem on empirical processes. A simulation study is conducted to study the accuracy of the limiting distribution in finite samples. Finally, we compare the power of our method with those of other popular tests of multivariate normality under a non-normal distribution.


A Non-parametric Analysis of the Tam-Jin River : Data Homogeneity between Monitoring Stations (탐진강 수질측정 지점 간 동질성 검정을 위한 비모수적 자료 분석)

  • Kim, Mi-Ah;Lee, Su-Woong;Lee, Jae-Kwan;Lee, Jung-Sub
    • Journal of Korean Society on Water Environment / v.21 no.6 / pp.651-658 / 2005
  • Non-parametric analysis is a powerful approach for testing data, especially water quality data that are not normally distributed. Data from three monitoring stations on the Tam-Jin River were evaluated for normality using skewness, Q-Q plots, and the Shapiro-Wilk test. The dataset comprised water quality constituents including temperature, pH, DO, SS, BOD, COD, TN, and TP for the period January 1994 to December 2004. The Shapiro-Wilk normality test was carried out at the 5% significance level. Most water quality data, except DO at monitoring stations 1 and 2, were not normally distributed, indicating that non-parametric methods should be used for these water quality data. Homogeneity was therefore tested with the Mann-Whitney U test (p<0.05), with the stations grouped into three pairs. Differences between stations 1 and 2 and between stations 1 and 3 were significant for pH, BOD, COD, TN, and TP, but those between stations 2 and 3 were not. In addition, a narrow gap between water quality ranges does not constitute a difference. Categories in which all three pairs of stations (1 and 2, 2 and 3, 1 and 3) showed differences in water quality were analyzed for TN and TP. The results of this research suggest an appropriate approach to homogeneity testing of water quality data and to the reasonable management of pollutant sources.
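
The pairwise homogeneity check described in this abstract can be illustrated with a minimal sketch of the Mann-Whitney U test. This is not the authors' code: the function names `average_ranks` and `mann_whitney_u` are hypothetical, the p-value uses the plain normal approximation (no tie or continuity correction, an assumption made for brevity), and the station values are synthetic.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Assumption: no tie or continuity correction is applied, so the
    p-value is only approximate for small or heavily tied samples.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p_value


# Synthetic illustrative values, not data from the study.
station_a = [1.2, 1.5, 1.1, 1.8, 1.4, 1.6]
station_b = [2.1, 2.4, 1.9, 2.6, 2.2, 2.5]
u_stat, p_value = mann_whitney_u(station_a, station_b)
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```

A p-value below 0.05 would be read, as in the abstract, as a significant difference between the two stations for that constituent. For real analyses, an established statistics library with exact small-sample p-values is the safer choice.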

Further Applications of Johnson's SU-normal Distribution to Various Regression Models

  • Choi, Pilsun;Min, In-Sik
    • Communications for Statistical Applications and Methods / v.15 no.2 / pp.161-171 / 2008
  • This study discusses Johnson's $S_U$-normal distribution, which captures a wide range of non-normality in various regression models. We provide likelihood inference using Johnson's $S_U$-normal distribution and propose a likelihood ratio (LR) test for normality. We also apply the $S_U$-normal distribution to binary and censored regression models. Monte Carlo simulations show that the LR test using the $S_U$-normal distribution can serve as a model specification test for a normal error distribution, and that the $S_U$-normal maximum likelihood (ML) estimators tend to yield more reliable marginal effect estimates in the binary and censored models when the error distributions are non-normal.

Estimating Discriminatory Power with Non-normality and a Small Number of Defaults

  • Hong, C.S.;Kim, H.J.;Lee, J.L.
    • The Korean Journal of Applied Statistics / v.25 no.5 / pp.803-811 / 2012
  • For credit evaluation models, we extend the study of discriminatory power based on the AUC obtained from a ROC curve when the number of defaults is small and the distribution functions of the defaults and non-defaults are normal. Since distribution functions do not satisfy normality in the real world, the distribution functions of the defaults and non-defaults are assumed to be normal mixture distributions, based on results showing that a normal mixture can fit non-normal data better than other distribution estimation methods. Using several AUC statistics, the discriminatory power under these circumstances is explored and compared with that under normal distributions.

A Study on Fatigue Analysis of Non-Gaussian Wide Band Process using Frequency-domain Method (주파수 영역 해석 기법을 이용한 비정규 광대역 과정의 피로해석에 관한 연구)

  • Kim, Hyeon-Jin;Jang, Beom-Seon
    • Journal of the Society of Naval Architects of Korea / v.55 no.6 / pp.466-473 / 2018
  • Most frequency domain-based approaches assume that the structural response is a Gaussian random process. However, many non-Gaussian processes, caused by multi-excitation and non-linearity in the structural response or in the load itself, are observed in real engineering problems. In this study, the effect of non-normality on fatigue damage is discussed through a case study, in which the accuracy of four frequency-domain methods for non-Gaussian processes is compared. The power-law and Hermite models, which are derived for non-Gaussian narrow-banded processes, tend to estimate fatigue damage less accurately relative to time-domain results when kurtosis is small, and give conservative results when kurtosis is large. The Weibull model seems to give conservative results in all environmental conditions considered. Among the four methods, the Benasciutti-Tovo model for non-Gaussian processes gives the best results in the case study. This study could serve as background material for understanding the effect of non-normality on fatigue damage.

Parametric and Non Parametric Measures for Text Similarity (텍스트 유사성을 위한 파라미터 및 비 파라미터 측정)

  • Mlyahilu, John;Kim, Jong-Nam
    • Journal of the Institute of Convergence Signal Processing / v.20 no.4 / pp.193-198 / 2019
  • The widespread circulation of genuine and fake information on the internet has led to various studies on text analysis. Copying and pasting others' work without acknowledgement and manipulating research results without proof have been trending for a while in the era of data science. Various tools have been developed to reduce, combat, and possibly eradicate plagiarism in various research fields. Text similarity can be measured manually with both parametric and non-parametric methods; this study implements cosine similarity and Pearson correlation as parametric measures and Spearman correlation as a non-parametric measure. The cosine similarity and Pearson correlation metrics achieved the highest similarity coefficients, while Spearman correlation showed low similarity coefficients. We recommend the use of non-parametric methods for measuring text similarity because they do not assume normality, as opposed to parametric methods, which rely on normality assumptions and are subject to bias.
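
The three measures named in this abstract can be compared on simple term-frequency vectors. The following is a rough sketch under stated assumptions, not the paper's implementation: whitespace tokenization, raw term counts, and the helper names (`term_vectors`, `cosine`, `pearson`, `spearman`) are all choices made here for illustration.

```python
import math
from collections import Counter


def term_vectors(doc_a, doc_b):
    """Build aligned term-count vectors over the shared vocabulary.

    Assumption: lowercase whitespace tokenization is good enough here.
    """
    counts_a = Counter(doc_a.lower().split())
    counts_b = Counter(doc_b.lower().split())
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[w] for w in vocab], [counts_b[w] for w in vocab]


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(u, v):
    """Pearson correlation; undefined (division by zero) for constant vectors."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def ranks(values):
    """1-based ranks with ties replaced by their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(u, v):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(u), ranks(v))


# Made-up sentences, used only to exercise the three measures.
vec_a, vec_b = term_vectors(
    "the test assumes normal data and normal errors",
    "the rank test does not assume normal data",
)
print(cosine(vec_a, vec_b), pearson(vec_a, vec_b), spearman(vec_a, vec_b))
```

Because term-count vectors contain many tied values (mostly zeros and ones), the rank-based Spearman coefficient can differ noticeably from the other two, which is one possible reading of the lower Spearman scores reported in the abstract.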

A study on Robust Estimation of ARCH models

  • Kim, Sahm-Yeong;Hwang, Sun-Young
    • Proceedings of the Korean Statistical Society Conference / 2002.11a / pp.3-9 / 2002
  • In financial time series, autoregressive conditional heteroscedastic (ARCH) models have been widely used for modeling conditional variances. In many cases, non-normality or heavy-tailed distributions in the data affect estimation methods that rely on a normality assumption. To address this problem, a robust function for the conditional variances of the errors is proposed, and the relative efficiencies of the resulting estimators are compared with those of other conventional models.


MOD M NORMALITY OF $\beta$-EXPANSIONS

  • Ahn, Young-Ho
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.9 no.2 / pp.91-97 / 2005
  • If $\beta > 1$, then every non-negative number $x$ has a $\beta$-expansion, i.e., $$x = \epsilon_0(x) + \frac{\epsilon_1(x)}{\beta} + \frac{\epsilon_2(x)}{\beta^2} + \cdots$$ where $\epsilon_0(x) = [x]$, $\epsilon_1(x) = [\beta(x)]$, $\epsilon_2(x) = [\beta(\beta(x))]$, and so on ($[x]$ denotes the integral part and $(x)$ the fractional part of the real number $x$). Let $T$ be the transformation on $[0,1)$ defined by $x \rightarrow (\beta x)$. It is well known that the relative frequency of $k \in \{0, 1, \cdots, [\beta]\}$ in the $\beta$-expansion of $x$ is described by the $T$-invariant absolutely continuous measure $\mu_\beta$. In this paper, we show the mod $M$ normality of the sequence $\{\epsilon_n(x)\}$.
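
The digit sequence in this abstract can be generated by the greedy procedure the definitions imply: take the integral part of $x$, then repeatedly multiply the fractional part by $\beta$ and split off the integral part again. The sketch below is only an illustration; the function names `beta_expansion_digits` and `partial_sum` are hypothetical, and floating-point arithmetic limits how many digits are trustworthy.

```python
import math


def beta_expansion_digits(x, beta, n_digits):
    """Return [eps_0, eps_1, ..., eps_n] for the beta-expansion of x.

    eps_0 is the integral part of x; each later digit is the integral
    part of beta times the current fractional part (one step of the
    map T: t -> fractional part of beta * t).
    """
    if beta <= 1 or x < 0:
        raise ValueError("requires beta > 1 and x >= 0")
    first = math.floor(x)
    digits = [first]
    frac = x - first
    for _ in range(n_digits):
        scaled = beta * frac
        digit = math.floor(scaled)
        digits.append(digit)
        frac = scaled - digit
    return digits


def partial_sum(digits, beta):
    """Evaluate eps_0 + eps_1/beta + eps_2/beta**2 + ... for the given digits."""
    return sum(d / beta ** k for k, d in enumerate(digits))


golden = (1 + math.sqrt(5)) / 2  # example base; its digits lie in {0, 1}
digits = beta_expansion_digits(0.3, golden, 20)
print(digits)
print(partial_sum(digits, golden))
```

Truncating after $n$ digits leaves an error smaller than $\beta^{-n}$, since the remainder is the current fractional part divided by $\beta^n$. The relative frequencies of each digit value in a long run of such digits are the quantities the abstract relates to the measure $\mu_\beta$.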


An Analysis of Panel Count Data from Multiple random processes

  • Park, You-Sung;Kim, Hee-Young
    • Proceedings of the Korean Statistical Society Conference / 2002.11a / pp.265-272 / 2002
  • An integer-valued autoregressive integrated (INARI) model is introduced to eliminate stochastic trend and seasonality from time series of count data. The INARI model extends the previous integer-valued ARMA model. We show that it is stationary and ergodic in order to establish asymptotic normality of the conditional least squares estimator. Optimal estimating equations are used to reflect, in the estimation, the categorical and serial correlations arising from panel count data and the variations arising from the three random processes involved in obtaining observations. Under regularity conditions for a martingale sequence, we show asymptotic normality of the estimators obtained from the estimating equations. Using cancer mortality data provided by the U.S. National Center for Health Statistics (NCHS), we apply our results to estimate the probabilities of cells classified by 4 causes of death and 6 age groups and to forecast the death count of each cell. We also investigate the impact of the three random processes on estimation.
