• Title/Summary/Keyword: statistical inferences

122 search results

A Naive Multiple Imputation Method for Ignorable Nonresponse

  • Lee, Seung-Chun
    • Communications for Statistical Applications and Methods / v.11 no.2 / pp.399-411 / 2004
  • A common method of handling nonresponse in sample surveys is to delete the incomplete cases, which may result in a substantial loss of data. Thus, in certain situations, it is of interest to create a complete set of sample values. A popular approach is to impute each missing value with the mean or the median of the responders. The difficulty with this method, which replaces each missing value with a single imputed value, is that inferences based on the completed dataset underestimate the precision of the inferential procedure. Various suggestions have been made to overcome this difficulty, but they may not be appropriate for public-use files where the user has only limited information about the reasons for nonresponse. In this note, a multiple imputation method is considered to create a complete dataset that can be used for all possible inferential procedures without misleading or underestimating the precision.
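The multiple-imputation idea described above can be sketched as follows. This is a generic illustration, not the paper's specific procedure: it uses an approximate Bayesian bootstrap to generate imputations and Rubin's combining rules to pool them, with made-up data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sample with ignorable nonresponse (np.nan = missing).
y = np.array([3.1, 2.4, np.nan, 5.0, 4.2, np.nan, 3.8, 4.6, np.nan, 2.9])
obs = y[~np.isnan(y)]
n_mis = int(np.isnan(y).sum())

M = 50  # number of imputations
means, vars_ = [], []
for _ in range(M):
    # Approximate Bayesian bootstrap: resample the responders, then draw
    # the imputed values from that resample.
    star = rng.choice(obs, size=obs.size, replace=True)
    imputed = rng.choice(star, size=n_mis, replace=True)
    complete = np.concatenate([obs, imputed])
    means.append(complete.mean())
    vars_.append(complete.var(ddof=1) / complete.size)

# Rubin's rules: pooled estimate, within- and between-imputation variance.
qbar = float(np.mean(means))          # pooled point estimate
W = float(np.mean(vars_))             # within-imputation variance
B = float(np.var(means, ddof=1))      # between-imputation variance
T = W + (1 + 1 / M) * B               # total variance
print(qbar, T)
```

The between-imputation term `B` is what restores the precision lost by single imputation: a single-imputation analysis would report only `W`.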

Predicting football scores via Poisson regression model: applications to the National Football League

  • Saraiva, Erlandson F.;Suzuki, Adriano K.;Filho, Ciro A.O.;Louzada, Francisco
    • Communications for Statistical Applications and Methods / v.23 no.4 / pp.297-319 / 2016
  • Football match predictions are of great interest to fans and the sports press, and in the last few years they have been the focus of several studies. In this paper, we propose a Poisson regression model to predict football match outcomes. We applied the proposed methodology to two national competitions: the 2012-2013 English Premier League and the 2015 Brazilian Football League. The number of goals scored by each team in a match is assumed to follow a Poisson distribution whose mean reflects the strength of the attack, the strength of the defense, and the home-team advantage. Inferences about all unknown quantities involved are made using a Bayesian approach. We calculate the probabilities of a win, draw, and loss for each match using a simulation procedure. Also via simulation, we obtain the probability of a team qualifying for continental tournaments, being crowned champion, or being relegated to the second division.
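The simulation step described above can be sketched as follows. The attack, defense, and home-advantage parameters here are invented for illustration (in the paper they are estimated via a Bayesian fit); the match-outcome probabilities are then just Monte Carlo frequencies of independent Poisson goal counts.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical (assumed) parameters on the log scale.
attack = {"A": 0.3, "B": 0.1}
defence = {"A": 0.1, "B": 0.2}
home_adv = 0.25

# Team A at home against team B: each side's goal count is Poisson,
# with mean driven by attack, opposing defense, and home advantage.
lam_home = np.exp(attack["A"] - defence["B"] + home_adv)
lam_away = np.exp(attack["B"] - defence["A"])

n_sim = 100_000
goals_home = rng.poisson(lam_home, n_sim)
goals_away = rng.poisson(lam_away, n_sim)

p_win = np.mean(goals_home > goals_away)
p_draw = np.mean(goals_home == goals_away)
p_loss = np.mean(goals_home < goals_away)
print(p_win, p_draw, p_loss)
```

Repeating this for every fixture of a season and tallying simulated league tables gives the championship and relegation probabilities mentioned in the abstract.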

The Effect of the Stress and the School Adjustment on the Ego-Resiliency of Juveniles in Correctional Facility (교정시설 청소년이 지각하는 스트레스와 학교생활적응이 자아탄력성에 미치는 영향)

  • Kim, Youn-Soon;Um, In-Sook
    • Journal of Families and Better Life / v.27 no.2 / pp.99-109 / 2009
  • The purpose of this study was to investigate the effect of stress and school adjustment on the ego-resiliency of juveniles. The sample size of this study is 283, which is large enough to support statistical inferences. Multiple regression analysis and hierarchical regression analysis with SPSS 10.0 were used as statistical methods. The results were as follows. First, the friends factor, among the four sub-factors of stress, negatively affects ego-resiliency. Second, interest in school life, among the three sub-factors of school adjustment, positively affects ego-resiliency. Third, observance of school norms, among the three sub-factors of school adjustment, negatively affects ego-resiliency. Based on these results, this study suggests ways to elevate the ego-resiliency of juvenile delinquents.

Misleading Confidence Interval for Sum of Variances Calculated by PROC MIXED of SAS (PROC MIXED가 제시하는 분산의 합의 신뢰구간의 문제점)

  • 박동준
    • The Korean Journal of Applied Statistics / v.17 no.1 / pp.145-151 / 2004
  • PROC MIXED fits a variety of mixed models to data and enables one to use the fitted models to make statistical inferences about the data. However, the simulation study in this article shows that, for the sum of two variance components in a simple regression model with an unbalanced nested error structure (a mixed model), PROC MIXED with REML estimators provides a confidence interval that does not maintain the stated confidence coefficient.
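The kind of coverage simulation the abstract refers to can be sketched in a simplified setting. This is not the paper's REML/PROC MIXED setup: it uses a balanced one-way random-effects model, ANOVA estimators, and a Satterthwaite-type chi-square interval for the sum of the two variance components, purely to show how empirical coverage is checked against the nominal level.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)

def coverage(sig_a2=1.0, sig_e2=1.0, k=10, n=5, reps=2000, alpha=0.05):
    """Empirical coverage of a Satterthwaite-type CI for sig_a2 + sig_e2
    in a balanced one-way random-effects model (illustration only)."""
    hits = 0
    true_sum = sig_a2 + sig_e2
    for _ in range(reps):
        a = rng.normal(0, np.sqrt(sig_a2), size=(k, 1))          # group effects
        y = a + rng.normal(0, np.sqrt(sig_e2), size=(k, n))      # observations
        gm = y.mean(axis=1, keepdims=True)                       # group means
        msb = n * ((gm - y.mean()) ** 2).sum() / (k - 1)         # between MS
        msw = ((y - gm) ** 2).sum() / (k * (n - 1))              # within MS
        s = msb / n + (n - 1) / n * msw        # estimate of sig_a2 + sig_e2
        # Satterthwaite approximate degrees of freedom for the combination
        df = s**2 / ((msb / n) ** 2 / (k - 1)
                     + ((n - 1) / n * msw) ** 2 / (k * (n - 1)))
        lo = df * s / chi2.ppf(1 - alpha / 2, df)
        hi = df * s / chi2.ppf(alpha / 2, df)
        hits += lo <= true_sum <= hi
    return hits / reps

cov = coverage()
print(cov)
```

If the reported empirical coverage falls well below the nominal 95%, the interval "does not keep the stated confidence coefficient" in the sense the abstract criticizes.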

Bayesian Multiple Change-Point Estimation and Segmentation

  • Kim, Jaehee;Cheon, Sooyoung
    • Communications for Statistical Applications and Methods / v.20 no.6 / pp.439-454 / 2013
  • This study presents a Bayesian multiple change-point detection approach to segment and classify observations that no longer come from the initial population after a certain time. Inferences are based on the multiple change-points in a sequence of random variables whose probability distribution changes. Bayesian multiple change-point estimation classifies each observation into a segment. We use a truncated Poisson distribution for the number of change-points and conjugate priors for the exponential-family distributions. The Bayesian method can lead to unsupervised classification of discrete variables, continuous variables, and multivariate vectors based on latent class models; the solution for the change-points therefore corresponds to stochastic partitions of the observed data. We demonstrate segmentation with real data.
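The conjugate-prior machinery the abstract mentions can be sketched in the simplest case: a single change-point in Poisson counts with a Gamma prior on each segment's rate and a uniform prior on the change-point position. (The paper handles multiple change-points with a truncated Poisson prior on their number; the data below are made up.)

```python
import math

# Hypothetical count data with an apparent change after index 5.
y = [1, 2, 1, 0, 2, 8, 9, 7, 10, 8]

a, b = 1.0, 1.0  # Gamma(a, b) conjugate prior on each segment's Poisson rate

def log_marginal(seg):
    """Log marginal likelihood of a Poisson segment under a Gamma(a,b) prior."""
    s, n = sum(seg), len(seg)
    return (a * math.log(b) - math.lgamma(a)
            + math.lgamma(a + s) - (a + s) * math.log(b + n)
            - sum(math.lgamma(v + 1) for v in seg))

# Posterior over the change-point position tau (uniform prior on tau);
# the segments are y[:tau] and y[tau:].
logs = [log_marginal(y[:t]) + log_marginal(y[t:]) for t in range(1, len(y))]
m = max(logs)
w = [math.exp(v - m) for v in logs]
post = [v / sum(w) for v in w]
tau_map = 1 + post.index(max(post))   # MAP change-point position
print(tau_map, post)
```

Because the Gamma prior is conjugate to the Poisson likelihood, each segment's marginal likelihood is available in closed form, so the posterior over partitions needs no numerical integration.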

On the Distribution and Its Properties of the Sum of a Normal and a Doubly Truncated Normal

  • Kim, Hea-Jung
    • Communications for Statistical Applications and Methods / v.13 no.2 / pp.255-266 / 2006
  • This paper proposes a class of distributions that is useful for making inferences about the sum of values from a normal and a doubly truncated normal distribution. The class is shown to be associated with the conditional distributions of the truncated bivariate normal. Its salient features are mathematical tractability and strict inclusion of the normal and skew-normal laws. Furthermore, it includes a shape parameter that, to some extent, controls the index of skewness, so the class of distributions should prove useful in other contexts as well. The theory needed to derive the class of distributions is provided, and some of its properties are also studied.
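The random variable in question can be explored by simulation; a minimal sketch (with arbitrary parameter values, and a rejection sampler for the truncated component) checks the simulated mean of the sum against the well-known closed form for the doubly truncated normal mean:

```python
import math
import numpy as np

rng = np.random.default_rng(7)

def phi(z):  # standard normal pdf
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def Phi(z):  # standard normal cdf
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# W = X + Y with X ~ N(mu_x, sd_x^2) and Y a N(mu_y, sd_y^2)
# doubly truncated to [a, b]; all values are illustrative.
mu_x, sd_x = 1.0, 1.0
mu_y, sd_y, a, b = 0.0, 1.0, -1.0, 2.0

# Simulate Y by rejection: keep normal draws that land in [a, b].
draws = rng.normal(mu_y, sd_y, 400_000)
y = draws[(draws >= a) & (draws <= b)]
x = rng.normal(mu_x, sd_x, y.size)
w = x + y

# Theoretical mean of the doubly truncated normal component:
# E[Y] = mu + sd * (phi(alpha) - phi(beta)) / (Phi(beta) - Phi(alpha)).
al, be = (a - mu_y) / sd_y, (b - mu_y) / sd_y
ey = mu_y + sd_y * (phi(al) - phi(be)) / (Phi(be) - Phi(al))
print(w.mean(), mu_x + ey)
```

The histogram of `w` is skewed unless the truncation is symmetric about `mu_y`, which is the skewness behavior the paper's shape parameter captures analytically.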

Preservice Secondary Mathematics Teachers' Statistical Literacy in Understanding of Sample (중등수학 예비교사들의 통계적 소양 : 표본 개념에 대한 이해를 중심으로)

  • Tak, Byungjoo;Ku, Na-Young;Kang, Hyun-Young;Lee, Kyeong-Hwa
    • The Mathematical Education / v.56 no.1 / pp.19-39 / 2017
  • Taking samples of data and using them to make inferences about unknown populations is at the core of statistical investigations. An understanding of the nature of a sample, as part of statistical thinking, therefore belongs to the area of statistical literacy, since the process of a statistical investigation can turn out to be useless if we do not appreciate the part sampling plays. However, the conception of sampling is a scheme of interrelated ideas entailing many statistical notions, such as repeatability, representativeness, randomness, variability, and distribution. This complexity leads many people, teachers as well as students, to reason about statistical inference with incorrect intuitions rather than a comprehensive understanding of the sample. Some research has investigated how the concept of a sample is understood by students, teachers, and preservice teachers; here we identify preservice secondary mathematics teachers' understanding of the sample, as part of statistical literacy, through a qualitative analysis. We designed four items that asked preservice teachers to write down their understanding of sampling tasks involving representativeness and variability. We then categorized similar responses and compared these categories with Watson's statistical literacy hierarchy. As a result, many preservice teachers turned out to lie at a low level of statistical literacy, ignoring context and critical thinking, especially regarding sampling variability rather than sample representativeness. Moreover, the experience of taking statistics courses in university did not appear to contribute to the development of their statistical literacy. These findings should be considered when designing preservice teacher education programs to promote statistics education.

A Study on Analysis of Likelihood Principle and its Educational Implications (우도원리에 대한 분석과 그에 따른 교육적 시사점에 대한 연구)

  • Park, Sun Yong;Yoon, Hyoung Seok
    • The Mathematical Education / v.55 no.2 / pp.193-208 / 2016
  • This study analyzes the likelihood principle and draws an educational implication. The analysis shows that frequentists and Bayesians interpret the principle differently, assigning it different roles. Frequentists regard the likelihood as a direct tool for statistical inference and thus see the principle as the basis for statistical inference using the likelihood ratio, whereas Bayesians see the likelihood as a means of updating and regard the principle as the basis for statistical inference using the posterior probability. Despite this distinction between the two methods of statistical inference, the two schools find room for compromise in the use of frequency-based prior probabilities. Accordingly, this study suggests statistics education that helps build students' critical eye by having them compare inferences based on the likelihood and on the posterior probability while learning and teaching the updating process from a frequency-based prior probability to a posterior probability.
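The comparison the study recommends, between a likelihood-based inference and a posterior-based one, can be shown with a minimal Beta-binomial example (the counts and the prior are hypothetical):

```python
# Hypothetical data: n coin flips with k heads.
n, k = 20, 14

# Likelihood-based point estimate (frequentist): the MLE maximizes
# the binomial likelihood p^k (1-p)^(n-k) at p = k/n.
mle = k / n

# Bayesian update: a Beta(a, b) prior combined with the same likelihood
# gives a Beta(a + k, b + n - k) posterior, with mean below.
a, b = 2.0, 2.0
post_mean = (a + k) / (a + b + n)
print(mle, post_mean)
```

Both estimates depend on the data only through the likelihood, which is why the likelihood principle is a point of contact between the two schools; the gap between the two numbers shrinks as `n` grows and the prior's influence fades.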

A GEE approach for the semiparametric accelerated lifetime model with multivariate interval-censored data

  • Maru Kim;Sangbum Choi
    • Communications for Statistical Applications and Methods / v.30 no.4 / pp.389-402 / 2023
  • Multivariate or clustered failure time data often occur in medical, epidemiological, and socio-economic studies when survival data are collected from several research centers. If the data are observed periodically, as in a longitudinal study, survival times are often subject to various types of interval-censoring, creating multivariate interval-censored data, and the event times of interest may be correlated among individuals who come from the same cluster. In this article, we propose a unified linear regression method for analyzing multivariate interval-censored data. We consider a semiparametric multivariate accelerated failure time model as the statistical analysis tool and develop a generalized Buckley-James method that makes inferences by imputing interval-censored observations with their conditional mean values. Since the study population consists of several heterogeneous clusters, where subjects in the same cluster may be related, we propose a generalized estimating equations approach to accommodate potential dependence within clusters. Our simulation results confirm that the proposed estimator is robust to misspecification of the working covariance matrix and that statistical efficiency can increase when the working covariance structure is close to the truth. The proposed method is applied to a dataset from a diabetic retinopathy study.
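The GEE component can be sketched in a stripped-down form: complete (uncensored) Gaussian responses in clusters, a linear model, and an exchangeable working correlation estimated from residuals. The paper's Buckley-James imputation step for interval censoring is omitted here; the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated clustered data: y_ij = 1 + 2*x_ij + b_i + e_ij,
# where b_i is a shared cluster effect inducing within-cluster correlation.
K, m = 200, 4                                  # clusters, cluster size
beta_true = np.array([1.0, 2.0])
b = rng.normal(0, 1.0, size=(K, 1))
x = rng.normal(size=(K, m))
y = beta_true[0] + beta_true[1] * x + b + rng.normal(0, 1.0, size=(K, m))

X = np.stack([np.ones((K, m)), x], axis=-1)    # (K, m, 2) per-cluster design

beta = np.zeros(2)
for _ in range(10):  # alternate between beta and correlation updates
    r = y - X @ beta                           # (K, m) residuals
    sig2 = np.mean(r**2)
    # Moment estimate of the exchangeable within-cluster correlation.
    rho, pairs = 0.0, 0
    for j in range(m):
        for l in range(j + 1, m):
            rho += np.sum(r[:, j] * r[:, l])
            pairs += K
    rho /= pairs * sig2
    # Exchangeable working covariance and the GEE normal equations.
    V = sig2 * ((1 - rho) * np.eye(m) + rho * np.ones((m, m)))
    Vinv = np.linalg.inv(V)
    A = sum(X[i].T @ Vinv @ X[i] for i in range(K))
    c = sum(X[i].T @ Vinv @ y[i] for i in range(K))
    beta = np.linalg.solve(A, c)

print(beta)
```

The estimator stays consistent even if the exchangeable structure is wrong (the robustness the abstract reports), but matching the true correlation improves efficiency.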

Features of sample concepts in the probability and statistics chapters of Korean mathematics textbooks of grades 1-12 (초.중.고등학교 확률과 통계 단원에 나타난 표본개념에 대한 분석)

  • Lee, Young-Ha;Shin, Sou-Yeong
    • Journal of Educational Research in Mathematics / v.21 no.4 / pp.327-344 / 2011
  • This study is a first step toward improving high school students' capability for statistical inference, such as obtaining and interpreting the confidence interval on the population mean that is currently taught in high school. We suggest five underlying concepts, 'discretion of contingency and inevitability', 'discretion of induction and deduction', 'likelihood principle', 'variability of a statistic', and 'statistical model', which are necessary to appreciate statistical inference as a reliable arguing tool despite its occasionally erroneous conclusions. Assuming that these five concepts should develop gradually over the school years, we analyzed Korean mathematics textbooks of grades 1-12. The following was found. Regarding the right choice of method for a given problem, no elementary textbook, and only a few high school textbooks, describe the difference between contingent and inevitable circumstances. Formal definitions of population and sample are not introduced until the high school grades, so the development of critical thinking about the reliability of inductive reasoning could not be observed. On the contrary, strong emphasis lies on calculations with the sample data, without any inference about the population based on the sample. Instead of the representative properties of a random sample, more emphasis lies on how to obtain a random sample. As a result, the fact that the random variability of the value of a statistic calculated from the sample is inherited from the randomness of the sample could neither be noticed nor explained. No comparative description of statistical inference against mathematical (deductive) reasoning was found, and few explanations of the likelihood principle and its probabilistic applications, in accordance with students' cognitive development, were found. It was also hard to find any explanation of the random variability of a statistic or of the existence of its sampling distribution. This is worth explaining because, although obtaining the sampling distribution of a particular statistic, such as a sample mean, is a very difficult job, merely noticing its existence may drastically change the understanding of statistical inference.
