• Title/Summary/Keyword: Statistical data

Search Results: 15,004

A Study on Statistical Thinking and developing Statistical thoughts (통계적 사고와 그 함양에 관한 연구)

  • Kim, Sang-Lyong
    • Education of Primary School Mathematics
    • /
    • v.12 no.1
    • /
    • pp.31-38
    • /
    • 2009
  • This paper aims to develop a program that cultivates statistical ability in elementary students. For this purpose, I examined the relationship between mathematical thinking and statistical thinking. I developed statistical programs covering classification, discussion of data, generation of statistical problems, and a project program. As a result, this study suggests implications for further elementary statistical education.

  • PDF

Contents Analysis on the Internet Sites for Statistical Information

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2006.04a
    • /
    • pp.131-140
    • /
    • 2006
  • There are many statistical information sites, as use of the internet has increased rapidly in recent years. In this paper, we explore and analyze internet sites for statistical information, such as statistical survey systems, education, databases, and terminology. We then classify these sites so that statistical information can easily be applied to particular spheres. In doing so, this study aims to enhance our understanding of internet sites for statistical information.

  • PDF

A Comparison Study on Statistical Modeling Methods (통계모델링 방법의 비교 연구)

  • Noh, Yoojeong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.17 no.5
    • /
    • pp.645-652
    • /
    • 2016
  • The statistical modeling of input random variables is necessary in reliability analysis, reliability-based design optimization, and the statistical validation and calibration of analysis models of mechanical systems. Statistical modeling methods include the Akaike Information Criterion (AIC), AIC correction (AICc), the Bayesian Information Criterion (BIC), Maximum Likelihood Estimation (MLE), and the Bayesian method. These methods select the best-fitting distribution among candidate models by calculating their likelihood function values from a given data set; some also consider the number of data points or parameters when identifying the distribution type. On the other hand, engineers in the field have difficulty selecting a statistical modeling method to obtain a statistical model of experimental data because of a lack of knowledge of these methods. In this study, commonly used statistical modeling methods were compared using statistical simulation tests, and their advantages and disadvantages were analyzed. In the simulation tests, various types of distribution were assumed as populations, and samples of different sizes were generated randomly from them. Real engineering data were used to verify each statistical modeling method.
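The likelihood-based selection this abstract describes (compute each candidate's maximized likelihood from the data, then penalize by parameter count) can be sketched in a few lines. This is a minimal illustration, not code from the paper; the normal and exponential candidates, the sample, and all names are assumptions for the example.

```python
import math
import random

def normal_loglik(x, mu, sigma):
    # Log-likelihood of an i.i.d. normal sample
    n = len(x)
    return (-n / 2 * math.log(2 * math.pi * sigma ** 2)
            - sum((v - mu) ** 2 for v in x) / (2 * sigma ** 2))

def exponential_loglik(x, lam):
    # Log-likelihood of an i.i.d. exponential sample
    return len(x) * math.log(lam) - lam * sum(x)

def aic(loglik, k):
    # AIC = 2k - 2 ln(L); smaller is better
    return 2 * k - 2 * loglik

random.seed(0)
data = [random.gauss(10.0, 2.0) for _ in range(200)]

# Closed-form MLEs for each candidate distribution
mu = sum(data) / len(data)
sigma = math.sqrt(sum((v - mu) ** 2 for v in data) / len(data))
lam = len(data) / sum(data)

scores = {
    "normal": aic(normal_loglik(data, mu, sigma), k=2),
    "exponential": aic(exponential_loglik(data, lam), k=1),
}
best = min(scores, key=scores.get)
```

With a clearly normal sample, the normal candidate's AIC comes out lowest; AICc and BIC differ only in how the penalty term depends on the sample size.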

Investigations into Coarsening Continuous Variables

  • Jeong, Dong-Myeong;Kim, Jay-J.
    • The Korean Journal of Applied Statistics
    • /
    • v.23 no.2
    • /
    • pp.325-333
    • /
    • 2010
  • Protection against disclosure of survey respondents' identifiable and/or sensitive information is a prerequisite for statistical agencies that release microdata files from their sample surveys. Coarsening is one of the popular methods for protecting the confidentiality of the data. Grouped data can be released in the form of microdata or tabular data. Instead of releasing the data in tabular form only, making microdata available to the public with interval codes and their representative values greatly enhances the utility of the data: it allows researchers to compute covariances between variables, build statistical models, or run a variety of statistical tests on the data. It may be conjectured that the variance of the interval data is lower than that of the ungrouped data, in the sense that the coarsened data do not have the within-interval variance. This conjecture is investigated using the uniform and triangular distributions. Traditionally, the midpoint is used to represent all the values in an interval. This approach implicitly assumes that the data are uniformly distributed within each interval. However, this assumption may not hold, especially in the last interval of economic data. In this paper, we use three distributional assumptions - uniform, Pareto, and lognormal - in the last interval and use either the midpoint or the median for other intervals, for the wage and food costs of Statistics Korea's 2006 Household Income and Expenditure Survey (HIES) data, and compare these approaches in terms of the first two moments.
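The conjecture that interval data lose the within-interval variance can be checked numerically. A minimal sketch, assuming a uniform population on [0, 100) and intervals of width 10 represented by their midpoints; this is an illustration only, not the paper's actual HIES setup:

```python
import random

def coarsen(values, width):
    # Replace each value by the midpoint of its interval [k*width, (k+1)*width)
    return [(int(v // width) + 0.5) * width for v in values]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

random.seed(1)
raw = [random.uniform(0, 100) for _ in range(10000)]
grouped = coarsen(raw, width=10)

v_raw, v_grp = variance(raw), variance(grouped)
```

By the law of total variance, the lost component for uniform data is roughly width**2 / 12, so with width 10 the gap v_raw - v_grp should land near 8.3.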

A Study of Association Rule Mining by Clustering through Data Fusion

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.18 no.4
    • /
    • pp.927-935
    • /
    • 2007
  • Gyeongnam province currently conducts a social index survey of its residents every year. However, analysis is limited because a different survey is administered on a three-year cycle. The solution to this problem is data fusion. Data fusion is the process of combining multiple data sources in order to provide information of tactical value to the user. However, data fusion is not the ultimate result in itself, so efficient analysis of the fused data is also important. In this study, we present a data fusion method for statistical survey data. We also suggest an application methodology for association rule mining by clustering through data fusion of statistical survey data.

  • PDF
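The association rule mining mentioned above rests on support and confidence counts over records. A minimal sketch, assuming fused survey records reduced to sets of categorical answers; the thresholds and the toy data are illustrative, not from the paper:

```python
from itertools import combinations

def association_rules(transactions, min_support=0.4, min_confidence=0.6):
    # Count supports for single items and pairs, then emit rules A -> B
    # whose support and confidence both clear the thresholds
    n = len(transactions)
    counts = {}
    for t in transactions:
        for size in (1, 2):
            for items in combinations(sorted(t), size):
                counts[items] = counts.get(items, 0) + 1
    rules = []
    for pair, c in counts.items():
        if len(pair) != 2 or c / n < min_support:
            continue
        a, b = pair
        for ante, cons in (((a,), b), ((b,), a)):
            confidence = c / counts[ante]
            if confidence >= min_confidence:
                rules.append((ante[0], cons, round(c / n, 2), round(confidence, 2)))
    return rules

# Toy "fused" records: each set mixes answers from two surveys
records = [{"urban", "car"}, {"urban", "car"}, {"urban", "bus"},
           {"rural", "car"}, {"urban", "car"}]
rules = association_rules(records)
```

Here the pair {urban, car} appears in 3 of 5 records, so both directions of the rule clear the (arbitrary) 0.4 support and 0.6 confidence thresholds.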

Review of statistical methods for survival analysis using genomic data

  • Lee, Seungyeoun;Lim, Heeju
    • Genomics & Informatics
    • /
    • v.17 no.4
    • /
    • pp.41.1-41.12
    • /
    • 2019
  • Survival analysis mainly deals with the time to event, including death, onset of disease, and bankruptcy. The common characteristic of survival analysis is that it contains "censored" data, in which the time to event cannot be completely observed, but instead represents the lower bound of the time to event. Only the occurrence of either time to event or censoring time is observed. Many traditional statistical methods have been effectively used for analyzing survival data with censored observations. However, with the development of high-throughput technologies for producing "omics" data, more advanced statistical methods, such as regularization, should be required to construct the predictive survival model with high-dimensional genomic data. Furthermore, machine learning approaches have been adapted for survival analysis, to fit nonlinear and complex interaction effects between predictors, and achieve more accurate prediction of individual survival probability. Presently, since most clinicians and medical researchers can easily assess statistical programs for analyzing survival data, a review article is helpful for understanding statistical methods used in survival analysis. We review traditional survival methods and regularization methods, with various penalty functions, for the analysis of high-dimensional genomics, and describe machine learning techniques that have been adapted to survival analysis.
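The censoring idea the review describes is exactly what the classical Kaplan-Meier estimator encodes: a censored observation shrinks the risk set without counting as an event. A minimal pure-Python sketch; the toy times and event flags are invented for illustration:

```python
def kaplan_meier(times, events):
    # times: observed times; events: 1 = event occurred, 0 = censored.
    # Returns (time, survival estimate) at each distinct event time.
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    surv, curve, i = 1.0, [], 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths, j = 0, i
        while j < len(pairs) and pairs[j][0] == t:
            deaths += pairs[j][1]
            j += 1
        if deaths:                      # censored-only times skip this step
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= j - i              # everyone observed at time t leaves the risk set
        i = j
    return curve

times  = [2, 3, 3, 5, 8, 8, 9, 10]
events = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

The censored subjects at times 3, 8, and 10 never multiply the survival estimate down; they only reduce the denominator for later event times.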

Patterns of Data Analysis?

  • Unwin, Antony
    • Journal of the Korean Statistical Society
    • /
    • v.30 no.2
    • /
    • pp.219-230
    • /
    • 2001
  • How do you carry out data analysis? There are few texts and little theory. One approach could be to use a pattern language, an idea which has been successful in fields as diverse as town planning and software engineering. Patterns for data analysis are defined and discussed, illustrated with examples.

  • PDF

Zooming Statistics: Inference across scales

  • Hannig, Jan;Marron, J.S.;Riedi, R.H.
    • Journal of the Korean Statistical Society
    • /
    • v.30 no.2
    • /
    • pp.327-345
    • /
    • 2001
  • New statistical methods are needed to analyze data in a multi-scale way. Some multi-scale extensions of standard methods, including novel visualization using dynamic graphics, are proposed. These tools are used to explore non-standard structure in internet traffic data.

  • PDF

Change Analysis with the Sample Fourier Coefficients

  • Jaehee Kim
    • Communications for Statistical Applications and Methods
    • /
    • v.3 no.1
    • /
    • pp.207-217
    • /
    • 1996
  • The problem of detecting change with independent data is considered. The asymptotic distribution of the sample change process with the sample Fourier coefficients is shown to be a Brownian bridge process. We suggest using dynamic statistics, such as a sample Brownian bridge, and graphs as statistical animation. Graphs, including change PP plots, are given by way of illustration with simulated data.

  • PDF
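The Brownian bridge limit of a sample change process is the basis of standard CUSUM-type tests: under no change, the centred, scaled partial sums behave like a bridge, so an unusually large excursion signals a change point. A minimal simulation sketch using a plain CUSUM statistic, not the paper's Fourier-coefficient construction; the shift size and seeds are arbitrary:

```python
import random

def change_statistic(x):
    # Max of the scaled CUSUM process; approximately Kolmogorov-distributed
    # under the no-change hypothesis
    n = len(x)
    mean = sum(x) / n
    sd = (sum((v - mean) ** 2 for v in x) / n) ** 0.5
    s, peak = 0.0, 0.0
    for v in x:
        s += v - mean                  # partial sums centred at the grand mean
        peak = max(peak, abs(s) / (sd * n ** 0.5))
    return peak

random.seed(2)
no_change = [random.gauss(0, 1) for _ in range(500)]
shifted = ([random.gauss(0, 1) for _ in range(250)] +
           [random.gauss(1.5, 1) for _ in range(250)])
stat_null, stat_shift = change_statistic(no_change), change_statistic(shifted)
```

The 95% critical value of the Kolmogorov limit is about 1.36, so the mean-shifted series should be flagged while the homogeneous one typically is not.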

Statistical bioinformatics for gene expression data

  • Lee, Jae-K.
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2001.08a
    • /
    • pp.103-127
    • /
    • 2001
  • Gene expression studies require statistical experimental designs and validation before laboratory confirmation. Various clustering approaches, such as hierarchical clustering, K-means, and SOM, are commonly used for unsupervised learning on gene expression data. Several classification methods, such as gene voting, SVM, or discriminant analysis, are used for supervised learning, where a well-defined response classification is possible. Estimating gene-condition interaction effects requires advanced, computationally intensive statistical approaches.

  • PDF
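The K-means clustering named in the abstract is easy to sketch for expression profiles. A minimal Lloyd's-algorithm illustration; the toy "up-regulated" and "down-regulated" profiles and the initial centres are invented for the example:

```python
def kmeans(points, init, iters=10):
    # Lloyd's algorithm: assign each profile to its nearest centre,
    # then move each centre to the mean of its assigned profiles
    centers = [list(c) for c in init]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        centers = [[sum(col) / len(cl) for col in zip(*cl)] if cl else c
                   for c, cl in zip(centers, clusters)]
    return centers, clusters

# Toy profiles: 5 up-regulated and 5 down-regulated "genes" over 4 conditions
up   = [[5.0 + 0.1 * i, 5.2, 4.8, 5.1] for i in range(5)]
down = [[1.0 + 0.1 * i, 0.9, 1.2, 1.0] for i in range(5)]
centers, clusters = kmeans(up + down, init=[up[0], down[0]])
print([len(c) for c in clusters])  # → [5, 5]
```

With one initial centre per pattern, the two expression groups separate in a single iteration and the centres settle on the group means.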