• Title/Summary/Keyword: data distributions

Bayesian Test for the Difference of Exponential Guarantee Time Parameters

  • Kang, Sang-Gil;Kim, Dal-Ho;Lee, Woo-Dong
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.1095-1106 / 2005
  • When X and Y have independent two-parameter exponential distributions, we develop Bayesian testing procedures for the equality of the two location parameters. The reference prior for the non-regular exponential model is derived. Under this reference prior, we propose Bayesian test procedures for the equality of the two location parameters using the fractional Bayes factor and the intrinsic Bayes factor. A simulation study and some real data examples are provided.
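
For orientation, the two-parameter exponential model and the hypotheses in this abstract can be written as follows. This is a sketch in assumed notation: the symbols $\mu$ (location, or guarantee time) and $\sigma$ (scale) are not taken from the paper. The model is non-regular because the support of the density depends on $\mu$.

```latex
f(x \mid \mu, \sigma) = \frac{1}{\sigma}
  \exp\!\left\{-\frac{x - \mu}{\sigma}\right\},
  \qquad x \ge \mu,\ \sigma > 0,
\qquad
H_0 : \mu_1 = \mu_2 \quad \text{versus} \quad H_1 : \mu_1 \ne \mu_2 .
```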

Transactions Clustering based on Item Similarity (아이템의 유사도를 고려한 트랜잭션 클러스터링)

  • 이상욱;김재련
    • Proceedings of the Korea Intelligent Information System Society Conference / 2002.11a / pp.250-257 / 2002
  • Clustering is a data mining method that consists of discovering interesting data distributions in very large databases. In traditional data clustering, the similarity of a cluster of objects is measured by the pairwise similarity of the objects in that cluster. In view of the nature of clustering transactions, we devise in this paper a novel measurement called item similarity and use it to perform clustering. With this item similarity measurement, we develop an efficient clustering algorithm for target marketing in each group.

NONPARAMETRIC ONE-SIDED TESTS FOR MULTIVARIATE AND RIGHT CENSORED DATA

  • Park, Hyo-Il;Na, Jong-Hwa
    • Journal of the Korean Statistical Society / v.32 no.4 / pp.373-384 / 2003
  • In this paper, we formulate multivariate one-sided alternatives and propose a class of nonparametric tests for possibly right-censored data. We obtain the asymptotic tail probability (or p-value) by showing that our proposed test statistics have asymptotically multivariate normal distributions. We also illustrate our procedure with an example and compare it with other procedures in terms of empirical power for the bivariate case. Finally, we discuss some properties of our test.

Bayesian Analysis for Multiple Capture-Recapture Models using Reference Priors

  • Younshik;Pongsu
    • Communications for Statistical Applications and Methods / v.7 no.1 / pp.165-178 / 2000
  • Bayesian methods are considered for multiple capture-recapture data. Reference priors are developed for such models, and a sampling-based approach using the Gibbs sampler is used for inference from the posterior distributions. Furthermore, approximate Bayes factors are obtained for model selection between trap and nontrap response models. Finally, the methodology is implemented for a capture-recapture model with generated data and real data.

Semiparametric Bayesian Regression Model for Multiple Event Time Data

  • Kim, Yongdai
    • Journal of the Korean Statistical Society / v.31 no.4 / pp.509-518 / 2002
  • This paper is concerned with semiparametric Bayesian analysis of the proportional intensity regression model of the Poisson process for multiple event time data. A nonparametric prior distribution is put on the baseline cumulative intensity function, and a usual parametric prior distribution is given to the regression parameter. We also allow heterogeneity among the intensity processes of different subjects by using unobserved random frailty components. A Gibbs sampling approach with the Metropolis-Hastings algorithm is used to explore the posterior distributions. Finally, the results are applied to a real data set.
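
One common way to write a proportional intensity model with multiplicative frailty is sketched below. The abstract does not give the paper's exact specification, so the notation is an assumption: $w_i$ is the unobserved frailty of subject $i$, $z_i$ its covariate vector, $\beta$ the regression parameter, and $\Lambda_0$ the baseline cumulative intensity that receives the nonparametric prior.

```latex
\lambda_i(t \mid z_i, w_i) = w_i \, \lambda_0(t) \, \exp\!\left(\beta^{\top} z_i\right),
\qquad
\Lambda_0(t) = \int_0^t \lambda_0(s) \, ds .
```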

A Robust Heteroscedastic Test for ARCH Models

  • Kim, Sahm-Yeong
    • Journal of the Korean Data and Information Science Society / v.15 no.2 / pp.441-447 / 2004
  • Li and Mak (1994) developed a test statistic for detecting non-linearity and heteroscedasticity in time series data. However, it is well known that this test statistic may be very sensitive when the errors have heavy-tailed distributions. Jiang et al. (2001) suggested a robust method for ARCH models, but its estimation procedure is very complicated. We suggest a robust method based on Huber's function, and our method works considerably better than that of Li and Mak (1994). In addition, the test statistic of our method is relatively easy to calculate.
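
As a minimal illustration of the Huber function mentioned in this abstract, the sketch below clips standardized residuals at a threshold `k`. The function names and the default `k = 1.345` (a commonly used tuning constant) are assumptions for illustration. How the paper plugs the function into its test statistic is not stated in the abstract and is not reproduced here.

```python
from typing import List


def huber_psi(x: float, k: float = 1.345) -> float:
    """Huber's psi function: identity on [-k, k], clipped to -k or k outside."""
    if k <= 0:
        raise ValueError("k must be positive")
    return max(-k, min(k, x))


def huberize(residuals: List[float], k: float = 1.345) -> List[float]:
    """Apply Huber's psi to each residual, bounding the influence of outliers."""
    return [huber_psi(r, k) for r in residuals]


if __name__ == "__main__":
    # A heavy-tailed residual (8.0) is pulled back to the threshold k.
    print(huberize([0.3, -1.0, 8.0]))
```

Small residuals pass through unchanged, so clipping only alters the contribution of extreme observations.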

Tests for Uniformity: A Comparative Study

  • Rahman, Mezbahur;Chakrobartty, Shuvro
    • Journal of the Korean Data and Information Science Society / v.15 no.1 / pp.211-218 / 2004
  • The subject of assessing whether a data set is from a specific distribution has received a good deal of attention. This topic is critically important for uniform distributions. Several parametric tests are compared. These tests can also be used to test the randomness of a sample. The Anderson-Darling $A^2$ statistic is found to be the most powerful.
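
The Anderson-Darling statistic named in this abstract can be computed for the Uniform(0, 1) null hypothesis with the standard formula $A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln u_{(i)} + \ln\left(1 - u_{(n+1-i)}\right)\right]$, where $u_{(1)} \le \dots \le u_{(n)}$ are the ordered observations. The sketch below is an illustration only: the function name is hypothetical, and critical values and the paper's simulation setup are not reproduced.

```python
import math
from typing import Sequence


def anderson_darling_uniform(sample: Sequence[float]) -> float:
    """Anderson-Darling A^2 statistic for H0: sample ~ Uniform(0, 1)."""
    u = sorted(sample)
    n = len(u)
    if n == 0:
        raise ValueError("sample must be non-empty")
    if not all(0.0 < x < 1.0 for x in u):
        raise ValueError("all observations must lie strictly inside (0, 1)")
    # With 0-based index i, the weight (2i + 1) equals (2j - 1) for j = i + 1,
    # and u[n - 1 - i] is the order statistic u_(n + 1 - j).
    total = sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


if __name__ == "__main__":
    print(anderson_darling_uniform([0.12, 0.35, 0.48, 0.71, 0.93]))
```

Larger values of $A^2$ indicate stronger departure from uniformity; the statistic gives extra weight to discrepancies in the tails.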

A Suggestion to Establish Statistical Treatment Guideline for Aircraft Manufacturer (국산 복합재료 시험데이터 처리지침 수립을 위한 제언)

  • Suh, Jangwon
    • Journal of Aerospace System Engineering / v.8 no.4 / pp.39-43 / 2014
  • This paper examines the statistical procedures that must be applied with care in composite material qualification and equivalency testing. It describes statistical considerations for outlier detection and handling, data pooling through normalization, review of data distributions, and determination of design allowables for structural analysis. Based on these considerations, it proposes that guidance on statistical procedures is needed for aircraft manufacturers who use composite material property databases.

Response Modeling with Semi-Supervised Support Vector Regression (준지도 지지 벡터 회귀 모델을 이용한 반응 모델링)

  • Kim, Dong-Il
    • Journal of the Korea Society of Computer and Information / v.19 no.9 / pp.125-139 / 2014
  • In this paper, I propose a response modeling method based on a Semi-Supervised Support Vector Regression (SS-SVR) algorithm. To increase the accuracy and profit of response modeling, unlabeled data in the customer dataset are used together with the labeled data during training. The proposed SS-SVR algorithm is designed as a batch learning algorithm to reduce training complexity. The label distributions of the unlabeled data are estimated to account for the uncertainty of labeling. Then, multiple training data are generated by oversampling from the unlabeled data and their estimated label distributions, and these are combined with the labeled data to construct the training dataset. Finally, a data selection algorithm, Expected Margin based Pattern Selection (EMPS), is employed to reduce training complexity. Experiments conducted on a real-world marketing dataset showed that the proposed response modeling method trained efficiently and improved both accuracy and expected profit.

Cluster analysis for Seoul apartment price using symbolic data (서울 아파트 매매가 자료의 심볼릭 데이터를 이용한 군집분석)

  • Kim, Jaejik
    • Journal of the Korean Data and Information Science Society / v.26 no.6 / pp.1239-1247 / 2015
  • In this study, 64 administrative regions with high frequencies of apartment trades in Seoul, Korea are classified by apartment sale price. To consider the distribution of apartment prices in each region as well as the mean price, the symbolic histogram-valued data approach is employed. Symbolic data include all types of data that have internal variation, such as intervals, lists, histograms, distributions, and models. The cluster analysis using symbolic histogram data finds that the Gangnam, Seocho, and Songpa districts and nearby regions have relatively higher prices and larger dispersions. This result is plausible because those regions have good accessibility to downtown and a good educational environment.