Title/Summary/Keyword: statistical approach


Empirical Bayes Thresholding: Adapting to Sparsity When It Is Advantageous to Do So

  • Silverman, Bernard W.
    • Journal of the Korean Statistical Society
    • /
    • v.36 no.1
    • /
    • pp.1-29
    • /
    • 2007
  • Suppose one is trying to estimate a high dimensional vector of parameters from a series of observations, one observation per parameter. Often, it is possible to take advantage of sparsity in the parameters by thresholding the data in an appropriate way. A marginal maximum likelihood approach, within a suitable Bayesian structure, has excellent properties. For very sparse signals, the procedure chooses a large threshold and takes advantage of the sparsity, while for signals with many non-zero values, the method does not perform excessive smoothing. The scope of the method is reviewed and demonstrated, and various theoretical, practical and computational issues are discussed, in particular exploring the wide potential and applicability of the general approach, and the way it can be used within more complex thresholding problems such as curve estimation using wavelets.
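
As a hedged illustration of the kind of rule the abstract describes, the sketch below estimates the mixing weight of a normal (rather than Laplace) spike-and-slab prior by marginal maximum likelihood and then thresholds each observation by its posterior odds. The function names are invented for illustration; this is not Silverman's EbayesThresh implementation.

```python
# Minimal sketch of empirical Bayes thresholding with a normal
# spike-and-slab prior (a simplification of the Laplace prior used
# in the paper). Names here are illustrative, not from the paper.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def fit_mixing_weight(x, tau=3.0):
    """Estimate the prior weight w of nonzero means by marginal ML.

    Model: x_i ~ (1 - w) N(0, 1) + w N(0, 1 + tau^2).
    """
    def neg_loglik(w):
        m = (1 - w) * norm.pdf(x) + w * norm.pdf(x, scale=np.sqrt(1 + tau**2))
        return -np.sum(np.log(m))
    res = minimize_scalar(neg_loglik, bounds=(1e-4, 1 - 1e-4), method="bounded")
    return res.x

def eb_threshold(x, tau=3.0):
    """Zero out each x_i when the posterior odds favor the spike."""
    w = fit_mixing_weight(x, tau)
    spike = (1 - w) * norm.pdf(x)
    slab = w * norm.pdf(x, scale=np.sqrt(1 + tau**2))
    post_zero = spike / (spike + slab)
    # hard-threshold rule: keep x_i only when P(theta_i != 0 | x_i) > 1/2
    return np.where(post_zero < 0.5, x, 0.0), w

rng = np.random.default_rng(0)
theta = np.concatenate([rng.normal(0, 3, 20), np.zeros(480)])  # sparse signal
x = theta + rng.normal(size=theta.size)
est, w_hat = eb_threshold(x)
print(f"estimated weight of nonzero means: {w_hat:.3f}")
```

For very sparse signals the estimated weight is small, which makes the implied threshold large, matching the adaptivity the abstract describes.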

Multiple Testing in Genomic Sequences Using Hamming Distance

  • Kang, Moonsu
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.6
    • /
    • pp.899-904
    • /
    • 2012
  • High-dimensional categorical data models with small sample sizes have not been used extensively for genomic sequences that involve count (or discrete) or purely qualitative responses. A basic task is to identify differentially expressed genes (or positions) among a large number of genes. This requires an appropriate test statistic and a corresponding multiple testing procedure, since a multivariate analysis of variance is not feasible. The family-wise error rate (FWER) is not appropriate when thousands of genes are tested simultaneously; the false discovery rate (FDR) is better suited to such multiple testing problems. Data from the 2002-2003 SARS epidemic show that the proposed test statistic, based on a pseudo-marginal approach with Hamming distance, performs well in combination with a conventional FDR procedure.
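
For context, the "conventional FDR procedure" the abstract refers to is typically the Benjamini-Hochberg step-up rule; a minimal sketch follows. The paper's pseudo-marginal Hamming-distance statistic is not reproduced here.

```python
# Minimal sketch of the Benjamini-Hochberg step-up FDR procedure.
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of rejected hypotheses at FDR level q."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # find the largest k with p_(k) <= (k / m) * q
    below = ranked <= (np.arange(1, m + 1) / m) * q
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # index of the largest such k
        reject[order[: k + 1]] = True      # reject all smaller p-values too
    return reject

pvals = np.array([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9])
print(benjamini_hochberg(pvals, q=0.05))   # rejects the first two
```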

Autoregressive Cholesky Factor Modeling for Marginalized Random Effects Models

  • Lee, Keunbaik;Sung, Sunah
    • Communications for Statistical Applications and Methods
    • /
    • v.21 no.2
    • /
    • pp.169-181
    • /
    • 2014
  • Marginalized random effects models (MREM) are commonly used to analyze longitudinal categorical data when population-averaged effects are of interest. In these models, random effects are used to explain both subject and time variation. Estimating the random effects covariance matrix is not simple in MREM because of its high dimension and the positive definiteness requirement. A relatively simple correlation structure, such as a homogeneous AR(1) structure, is often assumed; however, this assumption is too strong, and as a consequence the estimates of the fixed effects can be biased. To avoid this problem, we introduce an approach that models a heterogeneous random effects covariance matrix using a modified Cholesky decomposition. The approach yields parameters that can be modeled easily, without concern that the resulting estimator will fail to be positive definite, and the parameters have a sensible interpretation. We analyze metabolic syndrome data from a Korean Genomic Epidemiology Study using this method.
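
A minimal sketch of the modified Cholesky idea: completely unconstrained autoregressive parameters and log innovation variances always reassemble into a positive definite covariance matrix. This illustrates the decomposition itself, not the paper's heterogeneous model for the parameters.

```python
# Modified Cholesky parameterization: T Sigma T' = D, with T unit
# lower triangular (negative generalized autoregressive parameters
# below the diagonal) and D diagonal (innovation variances), so
# Sigma = T^{-1} D T^{-T} is positive definite by construction.
import numpy as np

def covariance_from_cholesky_params(phi, log_d):
    """Build Sigma from unconstrained GARPs phi and log variances."""
    q = log_d.size
    T = np.eye(q)
    idx = np.tril_indices(q, k=-1)
    T[idx] = -phi                      # negative GARPs below the diagonal
    D = np.diag(np.exp(log_d))         # innovation variances > 0
    Tinv = np.linalg.inv(T)
    return Tinv @ D @ Tinv.T

q = 4
rng = np.random.default_rng(1)
phi = rng.normal(size=q * (q - 1) // 2)    # no constraints needed
log_d = rng.normal(size=q)
Sigma = covariance_from_cholesky_params(phi, log_d)
print(np.linalg.eigvalsh(Sigma) > 0)       # all True: positive definite
```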

A Sequential Approach for Estimating the Variance of a Normal Population Using Some Available Prior Information

  • Samawi, Hani M.;Al-Saleh, Mohammad F.
    • Journal of the Korean Statistical Society
    • /
    • v.31 no.4
    • /
    • pp.433-445
    • /
    • 2002
  • Using some available information about the unknown variance $\sigma^2$ of a normal distribution with mean $\mu$, a sequential approach is used to estimate $\sigma^2$. Two cases are considered, according to whether the mean $\mu$ is known or unknown. The mean square error (MSE) of the new estimators is compared to that of the usual estimator of $\sigma^2$, namely the sample variance based on a sample of size equal to the expected sample size. Simulation results indicate that the new estimator is more efficient than the usual estimator of $\sigma^2$ whenever the actual value of $\sigma^2$ is not too far from the prior information.
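
A small illustrative simulation of the abstract's conclusion, using a simple fixed-sample estimator that shrinks the sample variance toward a prior guess rather than the paper's sequential procedure; the weight w = 0.5 is an arbitrary choice.

```python
# Shrinking the sample variance toward a prior guess sigma0_sq beats
# the plain sample variance in MSE when the guess is close to the
# truth, and loses when it is far off. Not the paper's procedure.
import numpy as np

def mse_comparison(sigma_sq, sigma0_sq, n=20, w=0.5, reps=20000, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, np.sqrt(sigma_sq), size=(reps, n))
    s2 = x.var(axis=1, ddof=1)               # usual unbiased estimator
    shrunk = w * s2 + (1 - w) * sigma0_sq    # pull toward the prior guess
    return np.mean((s2 - sigma_sq) ** 2), np.mean((shrunk - sigma_sq) ** 2)

print(mse_comparison(sigma_sq=1.0, sigma0_sq=1.1))  # shrinkage wins
print(mse_comparison(sigma_sq=1.0, sigma0_sq=4.0))  # bad guess: it loses
```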

A Dual Problem of Calibration of Design Weights Based on Multi-Auxiliary Variables

  • Al-Jararha, J.
    • Communications for Statistical Applications and Methods
    • /
    • v.22 no.2
    • /
    • pp.137-146
    • /
    • 2015
  • Singh (2013) considered the dual problem to the calibration of design weights to obtain a new generalized regression estimator (GREG) for the finite population total. In this work, we suggest a way to use the dual calibration of the design weights in the case of multi-auxiliary variables; in other words, we attempt to answer the concern raised in Remark 2 of Singh (2013). The same idea is also used to generalize the GREG estimator proposed by Deville and Särndal (1992). It is not an easy task to find the optimum values of the parameters appearing in our approach; therefore, a few suggestions are given for selecting values of these parameters based on a random sample. Using a real data set and a simple random sampling without replacement design, our approach is compared with the other approaches mentioned in this paper for different sample sizes. Simulation results show that all estimators have negligible relative bias and that the multivariate version of the Singh (2013) estimator is more efficient than the other estimators.
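
A minimal sketch of the multi-auxiliary GREG estimator of Deville and Särndal (1992), which is the baseline the paper generalizes; the dual calibration weights of Singh (2013) are not reproduced here.

```python
# GREG estimator of a finite population total: the Horvitz-Thompson
# total plus a regression adjustment using known auxiliary totals.
import numpy as np

def greg_total(y, X, d, Tx):
    """y: (n,) responses; X: (n, p) auxiliaries; d: (n,) design
    weights (inverse inclusion probabilities); Tx: (p,) known
    population totals of the auxiliaries."""
    B = np.linalg.solve(X.T @ (d[:, None] * X), X.T @ (d * y))
    ht_y = d @ y                      # HT estimate of the total of y
    ht_x = X.T @ d                    # HT estimates of auxiliary totals
    return ht_y + (Tx - ht_x) @ B

# toy population of N = 1000 with two auxiliaries, SRSWOR of n = 100
rng = np.random.default_rng(2)
N, n = 1000, 100
Xpop = rng.uniform(1, 5, size=(N, 2))
ypop = Xpop @ np.array([2.0, 3.0]) + rng.normal(size=N)
idx = rng.choice(N, size=n, replace=False)
d = np.full(n, N / n)
print(greg_total(ypop[idx], Xpop[idx], d, Xpop.sum(axis=0)), ypop.sum())
```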

New approach for analysis of progressive Type-II censored data from the Pareto distribution

  • Seo, Jung-In;Kang, Suk-Bok;Kim, Ho-Yong
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.5
    • /
    • pp.569-575
    • /
    • 2018
  • The Pareto distribution is important for analyzing data in actuarial science, reliability, finance, and climatology. In general, the unknown parameters of the Pareto distribution are estimated by the maximum likelihood method, which may yield inadequate inference for small sample sizes and heavily censored data. In this paper, a new approach based on a regression framework is proposed to estimate the unknown parameters of the Pareto distribution under the progressive Type-II censoring scheme. The proposed method provides a new regression-type estimator that employs the spacings of exponential progressive Type-II censored samples. The resulting estimator is consistent and outperforms the maximum likelihood estimator in terms of mean squared error and bias. The validity of the proposed method is assessed through Monte Carlo simulations and a real data analysis.
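
A hedged sketch of the general idea: if $X$ is Pareto with scale $\lambda$ and shape $\theta$, then $\log(X/\lambda)$ is exponential, and the normalized spacings of a progressive Type-II censored exponential sample are i.i.d., so regressing the log order statistics on the cumulative reciprocal risk-set sizes recovers $(\log\lambda, 1/\theta)$. This illustrates the flavor of a spacings-based regression estimator, not necessarily the authors' exact one.

```python
# Simulate a progressive Type-II censored Pareto sample via the
# standard spacings representation, then estimate (lambda, theta)
# by least-squares regression of log x_(i) on sum_{j<=i} 1/gamma_j.
import numpy as np

def progressive_censored_pareto(lam, theta, scheme, rng):
    """scheme[i] = number of units withdrawn at the (i+1)-th failure."""
    m = len(scheme)
    gamma = np.array([m - i + sum(scheme[i:]) for i in range(m)])  # risk sets
    z = rng.exponential(scale=1.0 / theta, size=m)
    t = np.cumsum(z / gamma)           # censored exponential order stats
    return lam * np.exp(t), gamma

def regression_estimates(x, gamma):
    s = np.cumsum(1.0 / gamma)         # expected exponential order stats
    slope, intercept = np.polyfit(s, np.log(x), 1)
    return np.exp(intercept), 1.0 / slope   # (lambda_hat, theta_hat)

rng = np.random.default_rng(3)
scheme = [2, 0, 1, 0, 0, 0, 2, 0, 0, 0]    # n = 15 units, m = 10 failures
x, gamma = progressive_censored_pareto(lam=1.0, theta=2.0, scheme=scheme, rng=rng)
print(regression_estimates(x, gamma))
```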

A convenient approach for penalty parameter selection in robust lasso regression

  • Kim, Jongyoung;Lee, Seokho
    • Communications for Statistical Applications and Methods
    • /
    • v.24 no.6
    • /
    • pp.651-662
    • /
    • 2017
  • We propose an alternative procedure for selecting the penalty parameter in $L_1$ penalized robust regression. The procedure is based on marginalization of a prior distribution over the penalty parameter, so the resulting objective function does not involve the penalty parameter at all. Moreover, the estimation algorithm automatically chooses a penalty parameter at each step using the previous estimate of the regression coefficients. The proposed approach thus bypasses cross-validation and saves computing time, and variable-wise penalization performs best from the prediction and variable selection perspectives. Numerical studies using simulated data and an application to the Boston housing data demonstrate that our proposals are competitive with, or much better than, cross-validation in terms of prediction, variable selection, and computing time.
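
A hedged sketch of the general idea of dispensing with a fixed penalty parameter: alternate a lasso fit with an update of $\lambda$ implied by marginalizing a hyperprior, here the generic EM-style update $\lambda = p/\sum_j|\beta_j|$ and a non-robust squared-error loss for brevity. This is not the authors' exact robust-lasso algorithm.

```python
# Automatic penalty choice: the previous coefficient estimate drives
# the next lambda, so no cross-validation grid is needed.
import numpy as np
from sklearn.linear_model import Lasso

def auto_lasso(X, y, n_iter=20, lam0=1.0, eps=1e-8):
    n, p = X.shape
    lam = lam0
    for _ in range(n_iter):
        # sklearn's objective is (1/2n)||y - Xb||^2 + alpha * ||b||_1
        fit = Lasso(alpha=lam / n, max_iter=10000).fit(X, y)
        beta = fit.coef_
        lam = p / (np.abs(beta).sum() + eps)   # hyperprior-driven update
    return beta, lam

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 10))
beta_true = np.array([3.0, -2.0] + [0.0] * 8)
y = X @ beta_true + rng.normal(size=200)
beta_hat, lam_hat = auto_lasso(X, y)
print(np.round(beta_hat, 2), round(lam_hat, 3))
```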

Statistical Model-Based Voice Activity Detection Based on Second-Order Conditional MAP with Soft Decision

  • Chang, Joon-Hyuk
    • ETRI Journal
    • /
    • v.34 no.2
    • /
    • pp.184-189
    • /
    • 2012
  • In this paper, we propose a novel approach to statistical model-based voice activity detection (VAD) that incorporates a second-order conditional maximum a posteriori (CMAP) criterion. As a technical improvement over the first-order CMAP criterion in [1], we consider both the current observation and the voice activity decisions in the previous two frames, to take full account of the interframe correlation of voice activity. This differs from the previous approach [1] in that we employ the voice activity decisions of the previous two frames in a second-order CMAP, which has quadruple thresholds with an additional degree of freedom, rather than the single previous frame of the first-order criterion. A soft-decision scheme is also incorporated, resulting in time-varying thresholds for further performance improvement. Experimental results show that the proposed algorithm outperforms the conventional CMAP-based VAD technique under various experimental conditions.
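
A minimal sketch of the decision structure the abstract describes: the likelihood-ratio threshold for the current frame is indexed by the decisions in the two previous frames, giving the "quadruple thresholds". The threshold values below are illustrative placeholders, and the soft-decision extension is omitted.

```python
# Second-order conditional decision rule: one threshold per
# combination of the two previous voice-activity decisions.
import numpy as np

def cmap_vad(likelihood_ratios, thresholds):
    """thresholds[(d2, d1)] = threshold given the two prior decisions."""
    d1 = d2 = 0                      # assume silence before the first frame
    decisions = []
    for lr in likelihood_ratios:
        d = int(lr > thresholds[(d2, d1)])
        decisions.append(d)
        d2, d1 = d1, d
    return decisions

# speech following speech gets a lenient threshold ("hangover"),
# speech onset after silence a strict one (placeholder values)
thresholds = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.5, (1, 1): 0.6}
lrs = np.array([0.3, 2.5, 0.9, 0.8, 0.4, 3.1, 0.7])
print(cmap_vad(lrs, thresholds))
```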

A Hybrid Approach Using Case-based Reasoning and Fuzzy Logic for Corporate Bond Rating

  • Kim, Hyun-jung;Shin, Kyung-shik
    • Proceedings of the Korea Intelligent Information Systems Society Conference
    • /
    • 2003.05a
    • /
    • pp.474-483
    • /
    • 2003
  • A number of studies of corporate bond rating classification have demonstrated that artificial intelligence approaches such as case-based reasoning (CBR) can be alternative methodologies to statistical techniques. CBR is a problem-solving technique in which case-specific knowledge from past experience is used to find the most similar solution to a new problem. To build a successful CBR system that reflects human information processing, the representation of the knowledge in each attribute is a key factor. We propose a hybrid approach using fuzzy sets, which describe the approximate phenomena of the real world, because fuzzy sets handle inexact knowledge expressed in common linguistic terms in a way that resembles human reasoning more closely than other existing techniques. Integrating fuzzy sets with CBR is important for developing effective methods for dealing with vague and incomplete knowledge, which can be represented statistically through the membership values of fuzzy sets in CBR.
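
An illustrative sketch of one way fuzzy sets can enter CBR retrieval: numeric attributes are fuzzified into membership degrees for linguistic terms, and case similarity is computed on the membership vectors. The terms, cases, and ratings below are invented for illustration and are not taken from the paper.

```python
# Fuzzify numeric attributes into (low, medium, high) memberships,
# then retrieve the most similar stored case in membership space.
import numpy as np

def triangular(x, a, b, c):
    """Triangular membership function peaking at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def fuzzify(x):
    """Membership of a ratio scaled to [0, 1] in low/medium/high."""
    return np.array([triangular(x, -0.5, 0.0, 0.5),
                     triangular(x, 0.0, 0.5, 1.0),
                     triangular(x, 0.5, 1.0, 1.5)])

def most_similar_case(query, case_base):
    """Retrieve the rating of the closest case in membership space."""
    q = np.concatenate([fuzzify(v) for v in query])
    best = min(case_base,
               key=lambda c: np.linalg.norm(
                   q - np.concatenate([fuzzify(v) for v in c["attrs"]])))
    return best["rating"]

case_base = [{"attrs": [0.8, 0.2], "rating": "AA"},
             {"attrs": [0.3, 0.7], "rating": "BBB"},
             {"attrs": [0.1, 0.9], "rating": "BB"}]
print(most_similar_case([0.75, 0.3], case_base))   # -> "AA"
```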


A Local Influence Approach to Regression Diagnostics with Application to Robust Regression

  • Huh, Myung-Hoe;Park, Sung H.
    • Journal of the Korean Statistical Society
    • /
    • v.19 no.2
    • /
    • pp.151-159
    • /
    • 1990
  • Regression diagnostics often involves assessment of the changes that result from deleting multiple cases. Diagnostic methodology based on global influence measures, however, requires prohibitive computing time. As an alternative, Cook (1986) developed the local influence approach, in which one checks whether a minor modification of the model specification influences key results of an analysis. In line with Cook's development, we propose and study an influence derivative method that yields both the magnitude and direction of case influences. The utility of our methodology is highlighted when case influence derivatives are plotted in a lower dimensional space. Such plots are especially effective in unmasking "masked" observations, both in least squares regression and in robust regression. We give several illustrations.
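
A minimal sketch of a case-weight influence derivative in least squares: attaching a weight $w_i$ to each case, the derivative of the coefficient vector at $w_i = 1$ is $(X'X)^{-1}x_i r_i$, which gives both the magnitude and the direction of each case's influence without any refitting. This is a generic version of the idea, not the authors' exact diagnostic.

```python
# Case influence derivatives for OLS: row i of the returned matrix
# is d(beta_hat)/d(w_i) evaluated at w = 1, i.e. (X'X)^{-1} x_i r_i.
import numpy as np

def influence_derivatives(X, y):
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    r = y - X @ beta                       # ordinary residuals
    return (X * r[:, None]) @ XtX_inv      # row i: r_i * x_i' (X'X)^{-1}

rng = np.random.default_rng(5)
X = np.column_stack([np.ones(30), rng.normal(size=30)])
y = 1.0 + 2.0 * X[:, 1] + rng.normal(size=30)
y[0] += 8.0                                # plant an influential case
D = influence_derivatives(X, y)
print(np.argmax(np.linalg.norm(D, axis=1)))  # -> 0, the planted case
```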
