• Title/Summary/Keyword: Performance-based Statistics


A class of CUSUM tests using empirical distributions for tail changes in weakly dependent processes

  • Kim, JunHyeong;Hwang, Eunju
    • Communications for Statistical Applications and Methods, v.27 no.2, pp.163-175, 2020
  • We consider a wide class of general weakly dependent processes, called ψ-weak dependence, which unifies, under natural conditions on the process parameters, almost all weak dependence structures of interest in statistics, such as mixing, association, Bernoulli shifts, and Markovian sequences. To detect changes in the tail behavior of weakly dependent processes, change point tests are developed by means of cumulative sum (CUSUM) statistics based on the empirical distribution functions of sample extremes. The null limiting distribution is established as a Brownian bridge; its proof relies on the ψ-weak dependence structure and the existence of the phantom distribution function of stationary weakly dependent processes. A Monte Carlo study is conducted to assess the sizes and powers of the CUSUM tests in GARCH(1, 1) models; in addition, real data applications are given with log-returns of financial data such as the Korean stock price index.
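
The CUSUM construction described above can be sketched in its simplest form: turn the series into tail-exceedance indicators and take the maximal centered partial-sum deviation. This is a minimal i.i.d.-style sketch under an assumed fixed `threshold`; the paper's normalization uses the long-run variance under ψ-weak dependence, which is simplified away here.

```python
import math

def cusum_tail_statistic(x, threshold):
    # Exceedance indicators: empirical tail behavior at the chosen threshold.
    y = [1.0 if v > threshold else 0.0 for v in x]
    n = len(y)
    total = sum(y)
    mean = total / n
    # i.i.d.-style variance; the paper would use a long-run variance instead.
    var = sum((v - mean) ** 2 for v in y) / n
    if var == 0.0:
        return 0.0
    s, stat = 0.0, 0.0
    for k in range(1, n + 1):
        s += y[k - 1]
        stat = max(stat, abs(s - k * total / n))  # centered partial sums
    # Under the null this scaled supremum converges to sup of a Brownian bridge.
    return stat / (math.sqrt(var) * math.sqrt(n))
```

A series whose exceedance pattern changes halfway produces a large statistic, while a constant series gives zero.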

High-dimensional linear discriminant analysis with moderately clipped LASSO

  • Chang, Jaeho;Moon, Haeseong;Kwon, Sunghoon
    • Communications for Statistical Applications and Methods, v.28 no.1, pp.21-37, 2021
  • There is a direct connection between linear discriminant analysis (LDA) and linear regression, since the direction vector of the LDA can be obtained by least squares estimation. This connection motivates penalized LDA when the model is high-dimensional, that is, when the number of predictive variables is larger than the sample size. In this paper, we study penalized LDA for a class of penalties, called the moderately clipped LASSO (MCL), which interpolates between the least absolute shrinkage and selection operator (LASSO) and the minimax concave penalty (MCP). We prove that the MCL-penalized LDA correctly identifies the sparsity of the Bayes direction vector with probability tending to one, which is supported by numerical studies showing better finite sample performance than the LASSO.
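
As a rough sketch of the interpolation idea, one common way to write an MCL-type penalty (this parameterization is an assumption for illustration, not quoted from the paper) is to clip the MCP slope from below at a level `tau`: `tau = 0` recovers the MCP, and `tau = lam` recovers the LASSO.

```python
def mcl_penalty(t, lam, gamma, tau):
    # Assumed MCL form: integrate the clipped slope max(lam - |t|/gamma, tau).
    # tau = 0 gives the MCP penalty; tau = lam gives the LASSO penalty lam*|t|.
    t = abs(t)
    knot = gamma * (lam - tau)  # point where the MCP slope falls to tau
    if t <= knot:
        return lam * t - t * t / (2.0 * gamma)  # MCP segment
    # Past the knot the penalty grows linearly with slope tau (LASSO-like tail).
    return lam * knot - knot * knot / (2.0 * gamma) + tau * (t - knot)
```

With `tau = 0` the penalty saturates at the MCP ceiling `gamma * lam**2 / 2`, so large coefficients are left unshrunk, while `tau > 0` keeps a LASSO-level amount of shrinkage on them.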

Sparse vector heterogeneous autoregressive model with nonconvex penalties

  • Shin, Andrew Jaeho;Park, Minsu;Baek, Changryong
    • Communications for Statistical Applications and Methods, v.29 no.1, pp.53-64, 2022
  • High-dimensional time series analysis has gained considerable attention in recent years. The sparse vector heterogeneous autoregressive (VHAR) model proposed by Baek and Park (2020) uses the adaptive lasso and a debiasing procedure in estimation, and showed superb forecasting performance for realized volatilities. This paper extends the sparse VHAR model by considering non-convex penalties such as SCAD and MCP, whose penalty design allows possible bias reduction. Finite sample performances of the three estimation methods are compared through Monte Carlo simulation. Our study shows, first, that taking cross-sectional correlations into account reduces bias. Second, the non-convex penalties perform better when the sample size is small; on the other hand, the adaptive lasso with debiasing performs well as the sample size increases. An empirical analysis based on 20 multinational realized volatilities is also provided.
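
The heterogeneous autoregressive structure underlying VHAR regresses today's realized volatility on daily, weekly, and monthly lag averages. A minimal univariate sketch of building those regressors follows; the window lengths 1/5/22 are the conventional HAR choices, assumed here rather than taken from this paper, and the vector extension stacks such regressors across series.

```python
def har_regressors(rv, day=1, week=5, month=22):
    """Build HAR lag-average regressors for a realized-volatility series rv."""
    rows = []
    for t in range(month, len(rv)):
        rows.append([
            sum(rv[t - day:t]) / day,      # daily component (previous day)
            sum(rv[t - week:t]) / week,    # weekly average
            sum(rv[t - month:t]) / month,  # monthly average
        ])
    return rows  # row k aligns with target rv[month + k]
```

Each row then enters a (penalized) least squares fit against the next-day volatility.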

Maximum product of spacings under a generalized Type-II progressive hybrid censoring scheme

  • Jeon, Young Eun;Kang, Suk-Bok;Seo, Jung-In
    • Communications for Statistical Applications and Methods, v.29 no.6, pp.665-677, 2022
  • This paper proposes a new estimation method based on the maximum product of spacings for estimating the unknown parameters of the three-parameter Weibull distribution under a generalized Type-II progressive hybrid censoring scheme, which guarantees a constant number of observations and an appropriate experiment duration. The proposed approach is suitable for situations where maximum likelihood estimation is invalid, especially when the shape parameter is less than unity. Monte Carlo simulation further shows that it gives enhanced performance in terms of bias. In particular, the superiority of this approach is revealed even when maximum likelihood estimation satisfies the classical asymptotic properties. Finally, to illustrate the practical application of the proposed approach, a real data analysis is conducted, and the superiority of the proposed method is demonstrated through a simple goodness-of-fit test.
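
The maximum product of spacings criterion replaces the likelihood with the product of CDF spacings at the ordered sample. A minimal sketch for a one-parameter exponential model with a grid search is shown below; the paper's setting (three-parameter Weibull, generalized Type-II progressive hybrid censoring) is omitted here for brevity.

```python
import math

def mps_objective(theta, data):
    """Log product of spacings: sum of log(F(x_(i)) - F(x_(i-1))),
    with F(x_(0)) = 0 and F(x_(n+1)) = 1; exponential CDF F(x) = 1 - exp(-x/theta)."""
    xs = sorted(data)
    cdf = [0.0] + [1.0 - math.exp(-x / theta) for x in xs] + [1.0]
    # Guard against log(0) when two spacings coincide numerically.
    return sum(math.log(max(cdf[i + 1] - cdf[i], 1e-300))
               for i in range(len(cdf) - 1))

def mps_estimate(data, grid):
    """Maximize the spacings objective over a candidate parameter grid."""
    return max(grid, key=lambda th: mps_objective(th, data))
```

In practice a numerical optimizer replaces the grid; the MPS objective stays bounded even when the likelihood is unbounded (e.g. Weibull shape below one), which is the situation the paper targets.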

A Study on Fire Scenario Analysis Based on Fire Statistics for Building Fire Risk Analysis (건축물 화재위험평가를 위한 화재통계 기반 화재시나리오 분석에 관한 연구)

  • Jin, Seung-Hyeon;Kim, Hye-Won;Koo, In-Hyuk;Kwon, Yeong-Jin
    • Proceedings of the Korean Institute of Building Construction Conference, 2022.04a, pp.81-82, 2022
  • This study aims to establish a methodology for rational fire risk assessment of building evacuation safety in case of fire and, specifically, to propose a fire risk assessment technique using fire scenarios that considers various uncertain factors arising in a fire. To analyze the extent to which the assumed conditions can occur, that is, the probability of each accident caused by fire, the safety rate is analyzed according to the presence or absence of each factor using fire statistics. Factors related to the fire protection performance and evacuation capacity of buildings are defined as disaster factors. In this study, disaster factors were classified into the following three categories.


Statistical Speech Feature Selection for Emotion Recognition

  • Kwon Oh-Wook;Chan Kwokleung;Lee Te-Won
    • The Journal of the Acoustical Society of Korea, v.24 no.4E, pp.144-151, 2005
  • We evaluate the performance of emotion recognition via speech signals when an ordinary speaker talks to an entertainment robot. For each frame of a speech utterance, we extract frame-based features: pitch, energy, formants, band energies, mel-frequency cepstral coefficients (MFCCs), and the velocity/acceleration of pitch and MFCCs. For discriminative classifiers, a fixed-length utterance-based feature vector is computed from the statistics of the frame-based features. Using a speaker-independent database, we evaluate the performance of two promising classifiers: the support vector machine (SVM) and the hidden Markov model (HMM). For angry/bored/happy/neutral/sad emotion classification, the SVM and HMM classifiers yield 42.3% and 40.8% accuracy, respectively. We show that this accuracy is significant compared to the performance of foreign human listeners.
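
The frame-to-utterance step described above can be sketched as collapsing a variable-length stream of frame values into a fixed-length vector of statistics. The particular statistics below (mean, standard deviation, min, max) are an illustrative assumption, not the paper's exact feature set.

```python
def utterance_features(frames):
    """Fixed-length statistics of one frame-level feature track (e.g. pitch),
    suitable as part of an utterance-level vector for a discriminative classifier."""
    n = len(frames)
    mean = sum(frames) / n
    std = (sum((f - mean) ** 2 for f in frames) / n) ** 0.5  # population std
    return [mean, std, min(frames), max(frames)]
```

Concatenating such statistics over all frame-based features (pitch, energy, each MFCC, and their derivatives) yields the fixed-length input the SVM expects, while the HMM can consume the frame sequence directly.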

Effect of complex sample design on Pearson test statistic for homogeneity (복합표본자료에서 동질성검정을 위한 피어슨 검정통계량의 효과)

  • Heo, Sun-Yeong;Chung, Young-Ae
    • Journal of the Korean Data and Information Science Society, v.23 no.4, pp.757-764, 2012
  • This research compares test statistics for homogeneity when the data are collected under a complex sample design. Survey data based on a complex sample design do not satisfy the independence condition required for the standard Pearson multinomial-based chi-squared test. Today, many data sets are collected by complex sample designs, yet tests for categorical data are often conducted using the standard Pearson chi-squared test. In this study, we compared the performance of three test statistics for homogeneity between two populations using data from the 2009 customer satisfaction evaluation survey of the service from Gyeongsangnam-do regional offices of education: the standard Pearson test, the unbiased Wald test, and the Pearson-type test with survey-based point estimates. Through empirical analyses, we first showed that the standard Pearson test greatly inflates the values of the test statistics, so its results are not reliable. Second, in comparing the Wald test and the Pearson-type test, we found that the test results are affected by the number of categories and by the mean and standard deviation of the eigenvalues of the design matrix.
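
For reference, a minimal sketch of the standard Pearson homogeneity statistic the abstract warns about: it is valid under simple random sampling, and under a complex design it must be corrected (e.g. via design effects), which is exactly the paper's point.

```python
def pearson_homogeneity(table):
    """Standard Pearson chi-squared statistic for homogeneity.
    table[i][j] = count of category j in population i (assumes SRS)."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    total = sum(row)
    stat = 0.0
    for i, r in enumerate(table):
        for j, obs in enumerate(r):
            exp = row[i] * col[j] / total  # expected count under homogeneity
            stat += (obs - exp) ** 2 / exp
    # Refer to chi-square with (rows - 1) * (cols - 1) degrees of freedom.
    return stat
```

Under clustering or unequal weights the null distribution of this statistic is no longer chi-squared with the nominal degrees of freedom, which is why its values appear inflated on complex-sample data.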

Performance study of propensity score methods against regression with covariate adjustment

  • Park, Jincheol
    • Journal of the Korean Data and Information Science Society, v.26 no.1, pp.217-227, 2015
  • In observational studies, handling confounders is a primary issue in measuring the treatment effect of interest. Historically, regression with covariate adjustment (covariate-adjusted regression) has been the typical approach to estimating the treatment effect while incorporating potential confounders into the model. However, since the introduction of the propensity score, covariate-adjusted regression has been gradually replaced in the medical literature by various balancing methods based on the propensity score. On the other hand, there is a paucity of research assessing propensity score methods against covariate-adjusted regression. This paper examines the performance of propensity score methods in estimating the risk difference and compares them with covariate-adjusted regression in a Monte Carlo study. The study demonstrates that, in general, covariate-adjusted regression with a variable selection procedure outperforms propensity-score-based methods in terms of both bias and MSE, suggesting that the classical regression method should be considered, rather than propensity score methods, if performance is a primary concern.
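
One common propensity score method in such comparisons is inverse probability weighting (IPW). A minimal sketch of the IPW risk-difference estimator with given propensity scores follows; in practice the scores would themselves be estimated (e.g. by logistic regression on the confounders), which is omitted here.

```python
def ipw_risk_difference(y, t, ps):
    """IPW estimator of the risk difference E[Y(1)] - E[Y(0)].
    y: binary outcomes, t: binary treatment indicators, ps: propensity scores P(T=1|X)."""
    n = len(y)
    # Weight treated outcomes by 1/ps and control outcomes by 1/(1-ps).
    treated = sum(yi * ti / p for yi, ti, p in zip(y, t, ps)) / n
    control = sum(yi * (1 - ti) / (1 - p) for yi, ti, p in zip(y, t, ps)) / n
    return treated - control
```

With correctly specified propensity scores this is unbiased, but extreme scores near 0 or 1 inflate its variance, one reason the covariate-adjusted regression can win on MSE in simulations like the paper's.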

A Note on the Efficiency Based Reliability Measures for Heterogeneous Populations

  • Cha, Ji-Hwan
    • Communications for Statistical Applications and Methods, v.18 no.2, pp.201-211, 2011
  • In many cases, populations in the real world are composed of different subpopulations. Furthermore, in addition to heterogeneity in the lifetimes of items, there can also be heterogeneity in the efficiency or performance of items. In this case, the reliability measures should be defined in a different way. In this article, we consider a mixture of stochastically ordered subpopulations. Efficiency-based reliability measures are defined when the performance of items in the subpopulations has different levels. Discrete and continuous mixing models are studied. The concept of association between the lifetime and the performance of items in subpopulations is defined. It is shown that considering efficiency can change the shape of the mixture failure rate dramatically, especially when the lifetime and the performance of items in subpopulations are negatively associated. Furthermore, the modelling method proposed in this paper is applied to the case where the stress levels of the operating environments of items differ.
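
The basic mixture failure rate the article builds on can be sketched for exponential subpopulations: it starts at the weighted mean of the subpopulation rates and decreases toward the smallest rate as the weaker subpopulation dies out. The efficiency weighting and association structure studied in the paper are not modeled in this sketch.

```python
import math

def mixture_failure_rate(t, weights, rates):
    """Failure rate of a mixture of exponential subpopulations:
    lambda_m(t) = sum_i w_i f_i(t) / sum_i w_i S_i(t)."""
    num = sum(w * lam * math.exp(-lam * t) for w, lam in zip(weights, rates))
    den = sum(w * math.exp(-lam * t) for w, lam in zip(weights, rates))
    return num / den
```

The decreasing shape arises even though every subpopulation has a constant failure rate: over time the surviving population is increasingly dominated by the more reliable subpopulation.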

Network-based regularization for analysis of high-dimensional genomic data with group structure (그룹 구조를 갖는 고차원 유전체 자료 분석을 위한 네트워크 기반의 규제화 방법)

  • Kim, Kipoong;Choi, Jiyun;Sun, Hokeun
    • The Korean Journal of Applied Statistics, v.29 no.6, pp.1117-1128, 2016
  • In genetic association studies with high-dimensional genomic data, regularization procedures based on penalized likelihood are often applied to identify genes or genetic regions associated with diseases or traits. A network-based regularization procedure can utilize biological network information (such as genetic pathways and signaling pathways in genetic association studies) and offers selection performance superior to other regularization procedures such as the lasso and elastic-net. However, network-based regularization has a limitation in that it cannot be applied to high-dimensional genomic data with a group structure. In this article, we propose to combine data dimension reduction techniques, such as principal component analysis and partial least squares, with network-based regularization for the analysis of high-dimensional genomic data with a group structure. The selection performance of the proposed method was evaluated by extensive simulation studies. The proposed method was also applied to real DNA methylation data generated from the Illumina Infinium HumanMethylation27K BeadChip, where methylation beta values of around 20,000 CpG sites over 12,770 genes were compared between 123 ovarian cancer patients and 152 healthy controls. This analysis also identified a few cancer-related genes.
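
The dimension-reduction step can be sketched as computing, for each gene's group of CpG sites, the samples' scores on the leading principal component, so that each group contributes one derived predictor to the network-regularized model. Below is a minimal power-iteration sketch (the paper's partial least squares variant is not shown).

```python
def first_pc_scores(X, iters=200):
    """Scores of each row of X (samples x sites in one group) on the leading
    principal direction, via power iteration on the centered Gram matrix."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]  # center columns
    v = [1.0] * p
    for _ in range(iters):
        s = [sum(Xc[i][j] * v[j] for j in range(p)) for i in range(n)]  # Xc v
        w = [sum(Xc[i][j] * s[i] for i in range(n)) for j in range(p)]  # Xc^T (Xc v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]  # leading right singular direction of Xc
    return [sum(Xc[i][j] * v[j] for j in range(p)) for i in range(n)]
```

Running this per gene replaces each group of correlated CpG columns with a single column, after which the network-based penalty can be applied at the gene level.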